Where the
Human Goes
I went from asking AI for answers to directing teams of agents. This is the working philosophy I'm writing as I go — hypotheses, not commandments.
In March 2026, I stopped asking AI for answers and started running it like a workforce. The shift came from an unlikely seat — a place of curiosity instead of a company-wide mandate. It wasn't a bigger model or a cleverer prompt. It was a flip in posture, from tool to team: I stopped typing into a chatbot and started staffing problems out instead. And I set one rule first — whatever the agents produce, I will never say: "Sorry, this was sent by Claude, I don't know why it said that."
Same task — first as the worker, then as the director. Watch what changes, and what doesn’t.
You’re the human in a chatbot session. Each click takes your next turn — watch how often the work bounces back to you.
This is the seat that doesn’t scale.
What you’re producing →
Now you direct. You brief the team once — they run the grind, you own the result.
The system ran the grind. You directed it — and the brief still has your name on it.
The four triggers
A chatbot stops scaling the moment the grindwork outgrows a single conversation. I actively judge four aspects to know when I've crossed that line — scale, complexity, the gap between the task and my own expertise, and sheer repetitiveness. Any one can tip me toward an agent team; usually it's a combination, the way a kitchen tips from one cook to a brigade when the orders, the courses, and the covers all climb at once. None of that, though, changes who owns the result.
You can delegate the work. You never delegate the ownership.
That's the engine under everything that follows. Over-automation is dangerous not because the model is bad, but because it tempts you to disown your own output — to treat the thing with your name on it as something that merely happened near you, rather than something you made. For me, accountability is an important professional value.
Delegating into the blind spot
People tend to delegate hardest where they can least judge the result — precisely where blind delegation does the most damage. The fix isn't to delegate less. It's to decompose: a trusted human at the input, a comprehending human at the output, the agent working only the verified middle.
To use a specific project example: an executive-level report on system reliability. Real data checked by people I trust goes in — not whatever the model can scrape or guess. Nothing comes out until a human who understands the material has reviewed it, in a voice I've coached the system to write in — despite having no data-visualization background of my own. The middle is "verified" only because both ends are.
Trusted data in
Real data, checked by people I trust — not whatever the model can scrape or guess.
The verified middle
The agent re-presents the trusted data, flags its own discrepancies and questions, and writes in my coached voice.
Comprehending review out
Nothing ships until a human who understands the material has reviewed it.
Underneath that sits a quieter move. Before any of it, I do the legwork on establishing the tone and format and gathering feedback early on myself — not from distrust, but to build my own sense of the terrain, enough to ground the agents and steer them when they drift.
The verified middle gives you the shape of delegation. How hard you watch it is a separate dial — a tension, not a switch.
How hard you watch
The gate to step further out of the loop isn't the agent getting better. It's the downstream humans understanding the artifact better. That reframes oversight entirely. Oversight is misinterpretation risk weighed against output quality — a communication problem AI amplifies, not a model problem. Humans misread human-made work too; the agent just produces more of it, faster — and with prettier formatting, it looks more believable.
Which creates an honest pain point. When most outputs from a single agentic pipeline look nearly identical, it's hard to find the needles in the haystack. So the first thing I coach the system to do is write in my voice, and flag its own discrepancies and questions — not to step out of the loop, but to make staying in it bearable. That coaching makes review easier. It does not, by itself, earn the agent more rope.
Five agent outputs, near-identical — one hides a discrepancy. With flagging off, you have to open each to check. Go find it.
Self-flagging turns a hunt into a glance — it makes staying in the loop bearable. It does not, by itself, let me step out of the loop.
The rope comes later, and it's a different stage. Agents are how I beat human-scale limits, yet I hold the output to a human-scale accountability standard — and reconciling those is a deliberate sequence: coach until review is easy, then, only where the stakes allow, extend trust and step back. Coach, then trust. Where the platform caps how much I can customize or coach an agent, I can't trust it as far, so I stay in heavier review — the tooling quietly sets the ceiling.
Let's explore a personal project: the pipeline I run for my own job-search materials is higher stakes than any work report, not lower. It represents me. It bends my career. It helped me understand exactly which work I'll never hand over.
Gate 1 — trusted data in (human)
Reliability metrics come from the engineering data owners, human-verified at source — not whatever the model can scrape or guess. The agent only ever re-presents data someone else already owns, which is exactly why this side stays lighter-touch.
Gate 2 — my comprehending review (human)
In the verified middle the agent re-presents the data into an exec-digestible format, flags its own discrepancies (a metric that moved unexpectedly), and source-checks every figure against the telemetry — in a voice I've coached it on. I read, correct, and sign off. Coach-then-trust: as review gets easy and the stakes allow, I step back; the platform ceiling caps how far.
Gate 1 — proceed after research (strategic)
The agent fetches the role + company, structures it into a research profile, and auto-flags things like Portuguese-first requirements or contractor-vs-CLT. Then it stops. I decide whether the role is worth pursuing — only I know my pipeline priorities.artifact · company research profile + summary
Gate 2 — CV review (accuracy)
It reframes my master CV to the role and exports a tailored doc. I verify every metric and ownership claim before it goes further — it auto-flags anything it couldn't source and any EPM-vs-PM distinction risk.artifact · tailored CV (.docx + .pdf)
Gate 3 — cover-letter review (voice)
It drafts from the tailored CV + my tone profile — never inventing experience. I read it for voice and authenticity, because that can't be delegated: it has to sound like a writer who was stoked on representing me.artifact · cover letter (.docx)
Gate 4 — submission (control)
The agent does not submit. The irreversible action — and the final accuracy, strategy, and timing call — is mine alone.artifact · none — I press send
The thing that represents me has more human gates — not fewer.
What I'll never automate
The line I won't cross isn't (AI) competence. It's care, it's representation. Three things stay human.
Stake
I am experimenting with allowing my agents to write my cover letters. The output was accurate but indifferent — factually me, and hollow.
Identity
An agent can imitate them, but it can't be me. (Only genuine photos and voice ship — never an AI likeness.)
Growth
My Executive MBA assignments are mine to grind through — I'm there to learn, and automating it eats the growth.
An agent can't care, can't be me, or you, and can't become us through the work. And because the line of what these models can do keeps moving as the technology and the ethics around it move, the rules are going to stay provisional.
A working philosophy that can't change isn't a philosophy — it's a superstition. So this framework adapts as the technology does, and as the realities of ethics, alignment, and data privacy evolve underneath it.
As agents get better at execution, the value of human work doesn't shrink — it expands. We get more space and time to think deeply and design experiments. The human isn't automated away. The human moves up — more capable, more creative, more themselves.
This page was made this way
So I made this one the way I work — grounded by a human interview (with me), engineered by a dedicated agent team, and gated by four required checkpoints where I reviewed the output, fed back, and changed it. Not rubber-stamped — changed. Tap a checkpoint.
The work here was delegated. The ownership never was. That's the whole reason I will never say:
"Sorry, this was sent by Claude, I don't know why it said that."