The Yes-Man Problem: Why I Gave My Review Agents Conflicting Orders


One draft. Three reviewers. Three verdicts: MINOR ISSUES, NEEDS REVISION, APPROVED.

Same file, same day (all three reviews are dated 2026-04-01). The draft was a research piece on DORA metrics, sent to three agents whose only job was to tell me whether it was ready. Accuracy returned MINOR ISSUES. Voice, NEEDS REVISION. The Skeptic, APPROVED.

The obvious reading is that two of them were broken. Reject it. Three verdicts on one file does not mean two reviewers malfunctioned. It means each was doing a different job, and the jobs do not produce the same answer. The split is the post.

A reviewer that agrees with you is telling you about itself

Ask one general-purpose agent “is this good?” and you get a yes-man.

The mechanism is plain. A single reviewer with an open mandate optimizes for being helpful, and helpful drifts toward agreement. It finds a few things to tidy, confirms the rest, and hands back a draft that feels reviewed without being tested. That is not a review. It is a second opinion from someone who wanted you to like the first one.

I never ran a clean control here (one open-mandate reviewer on the same draft), so “you get a yes-man” is a claim from years of doing this, not a result from this trial. What the trial supports is narrower: one reviewer on one context budget cannot hold accuracy-rigor, voice-calibration, and unbriefed skepticism in attention at once without one crowding out the others.

You also cannot prompt the drift away. “Be harsh.” “Be critical.” “Find ten problems.” That produces theater: the comma it would move, the header it would rephrase, the sentence it finds “slightly redundant,” none of it the one structural objection that would actually stop you shipping. A reviewer told to be mean generates mean-flavored noise. It is still optimizing for looking like it worked, not working.

The fix is structure. You engineer the conflict by giving reviewers jobs that pull in different directions, then read the disagreement as signal. A reviewer forced to care about exactly one thing tells you about your work instead of its own incentives.

Conflicting orders: three mandates, each blind on purpose

Here are the three jobs.

The Accuracy agent checks every factual claim against a source-of-truth dossier and ignores whether the prose sings. It does not care if a sentence is clunky; it cares whether the number is right.

The Voice agent checks tone and register against the established voice of the site and ignores whether the numbers are right. It will not verify a statistic; it will tell you a sentence reads like a textbook.

The Skeptic gets the draft after a revision pass, with the author’s rationales stripped out. This is Iron Law 01, “Strip Rationales Before Review”: the reviewer must find its own reasons to trust the work. Hand it my justification for every choice and I have pre-loaded its conclusion. Strip the rationales and it earns its verdict from the artifact alone. It cannot be talked into agreement, because the author who would do the talking is not in the room.

The topology matters, so let me be precise instead of dressing it up. The pipeline has eight stages. Voice and Accuracy run in parallel at Stage 4 on the same draft. The Skeptic runs downstream at Stage 6, after the draft has already been revised. This is a staged committee with conflicting mandates, not three agents racing the same byte-identical file at the same instant, and I am not going to pretend it is. By the time the Skeptic sees the draft, the easy problems are gone, so it catches what survived a revision.

The blindness is the design. A reviewer allowed to care about everything cares about nothing hard enough to block you.

The collision in detail

Here is the falsifiable heart of it. Each lens caught what the others structurally could not.

Accuracy caught one real factual error. The draft said Accelerate codified the framework from “five years of survey data.” It did not. The dossier is unambiguous: four years, the 2014-2017 data window. The Accuracy agent flagged it with the exact source line and the exact fix, then verified roughly thirty other statistics one by one against the dossier (the multipliers, the adoption figures, the respondent counts), flagged an article title it could not confirm, and noted that one phrase (“not speed”) stepped a hair beyond what the dossier strictly supported. The Voice agent would never have caught the five-versus-four-years error: checking the year against a dossier is not its job and it was told not to do it.

Voice caught five tone tells, and no fact was in dispute. One honest wrinkle, since this is a post about reviewers: the review’s body says “about six targeted cuts” while its verdict says “Five,” and it listed exactly five flags, so five it is. A thesis road-map sentence in the intro that announces structure instead of moving. “Both readings are defensible” changed to “both are right,” because defensible is academic courtesy and right is a person with a view. “Prescriptive rather than descriptive” flagged as seminar vocabulary, not staff-meeting vocabulary. An end-of-article “Research Attribution Guide” that read like documentation, not a human. And one paragraph stacking three biases into a single dense sentence, told to give each its own beat. It also named three passages that nailed the voice and told me to preserve them exactly. The Accuracy agent had no opinion on any of this and was right not to.

The Skeptic returned APPROVED, then attached seven notes. It said the piece could ship, and then told me why it almost shouldn’t. “Google’s own internal research” is too vague to be load-bearing: name the study or downgrade the claim. The 2,293x recovery multiplier is two orders of magnitude larger than the other numbers, and presented without comment it risks undermining all of them by association, so it needs a parenthetical. The 2025 archetype model gets discussed at length but never shows up in the Monday-morning actions: operationalize it or trim it. A sentence pointing at “the limitations above” actually pointed below. Three more notes ran alongside, including a methodology challenge and a competing-frameworks gap. None were blockers. All were real.

Not all seven of those notes partition cleanly by lens, and I am not going to pretend they do. A couple (the vague sourcing, the missing competing-frameworks paragraph) are the kind of thing a good open-mandate reviewer could also have caught. The argument does not need every note to be lens-exclusive. It needs the year-error and the tone-slips to be, and they are: the accurate reviewer was deaf to tone, the tonal reviewer blind to the wrong year. Point one open-mandate reviewer at the same draft and those two are what gets waved through in a helpful summary.

Breaking the split

Nothing resolved the disagreement automatically. There is no merge step that takes MINOR ISSUES, NEEDS REVISION, and APPROVED and computes a ruling. A person read three conflicting verdicts and weighed them, and weighing starts with reading them correctly. A NEEDS REVISION made of five tone edits is not a rejection; it is five edits. An APPROVED carrying seven notes is not a blank check; it is a conditional yes with homework. And a real factual error gets fixed first: a wrong year is wrong whether the prose sings or not.

So the five-to-four-years fix was made. The Voice edits were applied as targeted edits, not a structural rewrite, because the agent had said the draft was not fighting the voice and a rewrite would have overreacted to its verdict. The Skeptic’s notes were weighed one at a time: some I took, because honest sourcing and a parenthetical on a wild number make the piece better, and the soft, taste-level ones I overruled as taste rather than error. I am accountable for that call. The post shipped.

Worth admitting a tension the Skeptic itself raised. It is a single agent with a single mandate, and it did not drift toward agreement: it returned APPROVED and then spent seven notes talking me out of shipping. So what prevents the yes-man is not the count of reviewers. It is the staging and the stripped context, both design choices. “Structure, not a better prompt” holds, but only because the Skeptic’s behavior is itself a structural result.

This is where two of the Iron Laws do their work. The Skeptic exists because of Law 12, “Every Phase Needs an Adversary”: a pipeline that agrees with itself is an echo chamber with commit access. And the resolution belongs to Law 16, “The Human Is the Architect”: breaking the split is mine to do, not the agents’. They can disagree for me. They cannot decide for me. The moment a machine collapses three verdicts into one ruling, I am back to a single reviewer, and that reviewer is a yes-man.

This is a leadership post wearing a tooling costume

The reason you separate the person who does the work from the person who blesses it is the same reason you give reviewers conflicting mandates. One person who writes the code and signs off on it is a yes-man with a keyboard, for the same mechanical reason a single open-mandate review agent is: they optimize for the work being finished, not for the work being right. A conscientious engineer reviewing their own pull request is still reviewing their own pull request. The drift is structural, and character does not fix a structural problem.

I am extending an old principle by analogy here, not reporting a controlled org result. Separation of duties is decades older than any agent. The narrower bet, that reviewers should get actively conflicting jobs rather than one general “is this good,” is what the agents made cheap to test, and what I would still want confirmed on a real team before treating as law. The base principle was never in doubt: every org that let one person build, test, and approve their own deliverable has known where that goes. This draft just printed the receipt in a form I could not look away from: three verdicts on one file, in plain text, in an afternoon instead of a postmortem.

The org chart has had this bug for as long as there have been org charts.

Monday morning

If you are using agents: stop asking one of them whether your work is good. You will get a yes-man and you will deserve it. Give two of them conflicting jobs (one that checks whether you are correct, one that checks whether you sound like yourself, and forbid each from caring about the other’s territory) and give a third permission to reject you with the rationales stripped out. Then do the part they cannot: breaking the split when they disagree.

If you have no agents at all and never will: find the place in your team where one person both does the work and blesses it. There is always one. Put an adversary in the loop with a mandate that conflicts with theirs, and protect that adversary’s right to say no.

The goal is not agreement. Agreement is the failure mode you are designing against. The goal is to make the disagreement show up in a review, on a Tuesday, in text you can still edit, not in front of the customer.