Work goes where you point it.

The word "agent" gets used loosely now. Sometimes it means a system with real leverage: it browses, executes code, and takes actions across other systems. Sometimes it just means handing a task to a chatbot and expecting it to come back done.

I've been fumbling with both ends of that spectrum for months. Claude Code sessions that drift the moment intent gets fuzzy. Research tasks that return fluent prose aimed at the wrong problem.

Software that acts. Software that only complies.

The failures feel different in texture but come from the same place. I approach each task with the confidence of someone who has managed people and projects for years, and the work often comes back wrong. Or more precisely: pointed slightly elsewhere, solving a problem adjacent to mine.

Not wrong in the way a junior employee gets things wrong, where you can see the effort and trace the misunderstanding. Wrong in a cleaner, more disorienting way. The output was polished, assured, and aimed somewhere I never intended.

There's a specific quality to that moment. The document opens. The formatting is clean. The first sentence sounds right. And then, slowly, the feeling that something is off, like walking into a room and realizing the furniture has been moved slightly. Everything is where it should be. None of it is yours, or really right.

When this happens, someone on my team has a line I’ve come to deeply appreciate:

“The machine did exactly what we asked of it.”

It’s funny because it’s true, and uncomfortable because of what it implies. The failure wasn’t in the execution. It happened much earlier, in the handoff, in the space between what I meant and what I managed to say.

If you’ve spent time in organizations, the pattern is familiar. A project drifts off course, and when you reconstruct what happened, the breakdown is almost never technical. Someone had a goal in their head that never made it into words. Context that felt obvious went unshared. The work ran for weeks without a checkpoint, and by the time anyone looked closely, the cost of correction had compounded into the cost of starting over.

We call these execution failures, but they’re coordination failures. No one maintained the shape of the work while it was happening, and that’s why it fell apart.

I’ve watched this pattern play out in conference rooms and Slack channels for years. What I didn’t expect was how clearly AI agents would reproduce it, or how quickly.

When I hand off a task to an agent and the output drifts, moving cleanly in a direction I never intended, the drift becomes visible in minutes rather than weeks. There’s no social buffer, no status update that sounds productive, no reassuring meeting where everyone nods along. Just the work itself, returned to me, making the gap between intention and instruction impossible to ignore.

The machine did exactly what we asked of it.

The question is why we asked for that.

Many years ago, I sat through a presentation at work that everyone praised afterward. The speaker was talented, the kind of person who makes complexity feel manageable just by the way they hold a room. They painted a vision, told a story, and built momentum. People left energized.

But in the row behind me, a software engineer was muttering. Not heckling, just a low, running commentary I could only imagine, not hear. We tried that. Doesn't work. What about the thing from last quarter? While everyone else was swept up in the performance, this person was looking underneath it.

Meanwhile, I was nodding along with everyone else. I remember that now more clearly than anything the speaker actually said.

I wanted to hear what that software engineer was muttering about. Not because I enjoy cynicism, but because I’ve learned to trust people who can’t be impressed by delivery alone. The ones who sit with the substance even when the presentation is dazzling.

Confident speech in organizations hides problems. It smooths over uncertainty by making ambiguity feel temporary, already handled. A vague strategy sounds visionary if delivered well. Wandering projects can feel productive if someone narrates them with enough certainty. We’ve built cultures around making confusion sound like progress. Most of us learn to be careful about voicing doubt when the room is nodding along.

The social cost of skepticism is real. Questioning a charismatic leader, especially in public, carries risk. So problems hide behind eloquence. They shelter in the gap between how persuasive something sounds and how well it actually holds together.

With people, that smoothing usually takes the form of a promise. A confident assertion that the ambiguity will be resolved later. I’ve got it. I’ll figure it out. I’ll circle back once I understand more. Ambiguity is deferred, not resolved. The work keeps moving on the assumption that someone will correct course later.

With agents, the smoothing happens differently. There is no promise, only compliance. The ambiguity doesn’t get deferred. Whatever you failed to specify hardens immediately and is executed as output, whether or not it matches your intent.

When I first started using AI chatbots, then agents, for meaningful work, I assumed this dynamic would disappear. There are no egos to protect or reputations on the line. No relationships smoothing over gaps. Just the work, returned without performance. What I didn’t account for was how confidently the work would be presented.

AI agents don’t hedge. They don’t signal uncertainty through hesitation or tone. The prose arrives fluent and grammatically pristine. It sounds like it knows what it’s doing even when the substance has drifted completely off target. So the confident speech problem doesn’t vanish. It changes form.

Human charisma hides problems behind social performance. You want to believe the speaker partly because believing feels good, and partly because doubting has a cost. Performance and substance blur together.

AI confidence strips away the social layer but keeps the fluency. You’re alone with the output. No rooms to read or reputations to weigh. Just prose that sounds equally certain whether it’s right or wrong.

The question I keep sitting with is whether that makes problems easier to see. Or whether I just think it does because I want that to be true.

I don’t have a clean answer. But I’ve noticed something in my own habits.

When a colleague presents work confidently, I often have to work up a bit of courage to push back. There’s a relationship to consider and a tone to manage. Even when I see gaps, I’m calculating how to name them without sounding like I’m attacking the person.

When an agent presents work confidently, that calculation disappears. I can be the cynic in the back row without social cost. I can look at fluent prose and ask, flatly, is this correct? I can say “the machine did exactly what we asked of it” and mean it as an indictment of my own instructions.

Something about the absence of a person on the other side makes scrutiny easier. The tools that used to compensate for fuzzy thinking (tone, reassurance, momentum, the ability to smooth over gaps with confidence) don’t help here. There’s no one to persuade, no room to read, no social energy to carry the work forward. Whatever structure exists has to live in the instructions themselves.

The barrier to questioning those instructions, and their results, is now much lower. Performance can’t dazzle me into silence. This feels like a small thing. I’m not sure it is.

A few weeks after the six-page detour, I tried again with a similar task. This time I wrote out exactly what I needed before opening the tool. I named the question. I explained why it mattered and how I’d use the output. I listed what I didn’t want. I broke the work into parts and asked for the first before the second.

What struck me afterward wasn’t the improvement in output. It was how rarely I bring that level of preparation to human collaboration. The same discipline that made the agent useful (clear outcomes, context supplied up front, work broken into checkable pieces, feedback arriving while the work is still in motion) is what makes delegation work anywhere. I know this.

Now I realize that a lot of what I used to call “common sense” in others often functioned as a buffer for my own lack of clarity. Anyone who has managed people or projects knows this intuitively. But knowing and practicing are different, and the gap between the two is where coordination failures live.

Agents compress the feedback loop on that gap. In human teams, drift is constant but rarely linear. People correct in small ways: a pause, a raised eyebrow, a clarifying question, a moment of hesitation that pulls the work back toward center.

With an agent, there is no self-correction. The first instruction sets the direction, and the work travels cleanly on that vector until it hits something solid. Without wobble.

Agents don’t give you weeks to discover that your initial direction was vague. You get minutes. They don’t let you blame the drift on interpretation.

The machine did exactly what you asked.

This compression can feel like a technology problem. The agent isn’t good enough, isn’t smart enough, doesn’t understand what we really meant. But the longer I work with these tools, the more I think the compression is the feature. It shows me, faster than any human collaboration could, exactly where my coordination habits break down.

There’s an irony here. For years, I’ve been suspicious of confident speech at work. The executive who makes uncertainty sound like vision. The project lead who narrates chaos as progress. I’ve learned to watch for the gap between how assured someone sounds and how solid the substance actually is.

Now I’m working with tools that produce confident speech by default. Tools that can’t hesitate or hedge. And I’m finding that the absence of a human performer makes it easier, not harder, to see through the confidence to what’s really there.

Maybe that’s the real mirror. Not just that agents expose coordination failures, but that they reveal how much social performance has been obscuring those failures all along. The problems were always there. They were hiding behind charisma, behind the discomfort of doubt, behind smooth delivery.

The machine did exactly what we asked of it. Now we have to sit with what we asked for.

The interfaces we have for this work don’t help yet. A command line built for issuing instructions is a poor place to manage anything that unfolds over time. You type a thing, you wait, you’re told something, you react. This model encourages one-shot delegation and delayed judgment.

Long-running work needs different rhythms. Places to pause and check orientation. Moments where drift can be caught early, while correction is still cheap. Attention on the shape of the work, not just the output.

I don’t know what those rhythms look like yet for agent collaboration. We’re early. The tools are changing faster than the practices around them. But I suspect the answer has less to do with better prompts and more to do with steadier attention. The kind that treats delegation as something you stay with rather than something you hand off.

It’s tempting to turn this into a question about whether the tools will improve.

Longer context. Better memory. Retrieval that doesn’t forget the one thing that mattered. Tool use that works most of the time. Multi-step reasoning that actually stops. Error handling that catches the obvious mistake before it ships. Guardrails that aren’t decorative. Orchestration that doesn’t hide drift. Lower latency. Lower cost. Better evaluation. Cleaner logs. Feedback loops that close instead of just existing.

The interesting question isn’t when or how all that happens, but whether we’ll get better at the same speed. Whether the compression these tools create in our work, the speed at which they surface coordination failures, will teach us something about how we work. Or whether we’ll find new ways to look away.

For now, the machine does exactly what I asked of it. The cursor blinks. The output waits. I'm still learning what I meant.

Part of the [Practice] series.