Claude Council
A Decision Engine That Disagrees With Itself
Summary
A decision engine that pressure-tests a high-stakes question through thirteen roles instead of answering it once. A single model asked a hard question gives one perspective with one set of blind spots, and asking again returns much the same answer with the same gaps. The council is built to make one model disagree with itself.
Five advisor agents analyze the question in parallel through fixed lenses, from failure analysis to execution. A quality gate screens their output, the responses are anonymized to letters and written to disk, and five peer reviewers stress-test the anonymized set, including one that attacks the strongest answer and one that defends the weakest. A chairman synthesizes a verdict, and a second gate audits the synthesis before release.
The result is a defensible recommendation with a paper trail: where the analysis converges, where it clashes, what every angle missed, and the single next action, delivered as a self-contained report alongside a full transcript. Thirteen roles, two audit gates, one verdict you can argue with.
Overview
- A multi-agent decision engine that runs a high-stakes question through thirteen roles before returning a verdict
- Five advisor agents analyze in parallel through fixed lenses: failure analysis, first principles, maximum upside, fresh eyes, and execution
- Five peer reviewers then stress-test the anonymized responses, including one that attacks the strongest answer and one that defends the weakest
- Two quality gates screen advisor output and audit the final synthesis before anything is released
- Advisor responses are anonymized to letters and written to disk before review, so the critique judges arguments rather than their source
- Output is a self-contained HTML report and a full markdown transcript, not a single opinion
Role
- Designer and builder of the full council architecture
- Wrote the thirteen agent definitions and their output contracts
- Designed the two-gate quality system and the anonymization-to-disk protocol
- Built the parallel dispatch and file-based state on Claude Code subagent orchestration
- Designed the chairman synthesis and the fixed verdict format
00. Table of Contents
01. The Problem: One model, one answer, one set of blind spots.
02. The Design: How to make a single model disagree with itself.
03. The Five Advisors: Five fixed lenses analyzing the same question in parallel.
04. Anonymization: Why the responses are stripped to letters and written to disk.
05. The Five Reviewers: Stress-testing the strongest and weakest answers on purpose.
06. The Two Gates: Where the pipeline is built to stop itself.
07. The Verdict: What the council returns, and why it can be trusted.
01. The Problem
A single model asked a hard question returns one answer shaped by one set of assumptions. Ask it again and it produces a close variant of the same answer, carrying the same blind spots, because the second pass reasons from the same priors as the first. For a low-stakes question that is fine. For a decision where being wrong is expensive, a single confident answer is a risky output, because nothing in the process ties its confidence to whether it is right.
What is missing is disagreement. A real advisory board is useful precisely because its members see the same problem differently, and one person’s blind spot is another’s focus. The hard part is reproducing that with one model, which left alone collapses toward a single consensus voice no matter how many times it is asked.
Asking the same model twice gives you the same blind spot twice.
02. The Design
The council makes one model disagree with itself by assigning each pass a fixed lens it cannot abandon, then keeping the passes from seeing each other until each has committed to a position. The shape is set. An optional research pass feeds five advisors, a quality gate screens them, five peer reviewers stress-test the result, a chairman synthesizes a verdict, a second gate audits it, and two artifacts are written.
The cost is deliberate. A full run is thirteen to fourteen agent calls, which is why the council is reserved for decisions where being wrong is expensive rather than used as a default. It refuses the cases it is not built for, including factual lookups, creation tasks, and casual questions with no real stakes.
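The stage order and the call count above can be sketched as a small table of stages. This is a minimal illustration, not the council's actual code: the `Stage` type, the stage names, and the assumption that each gate and the chairman cost one agent call apiece are all hypothetical, chosen so the totals match the thirteen-to-fourteen figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str            # illustrative label, not a name from the real system
    agent_calls: int     # assumed number of agent calls for the stage
    optional: bool = False


# Stage order as described in the text. A second-gate retry of the
# synthesis would add calls and is not counted here.
PIPELINE = (
    Stage("research", 1, optional=True),
    Stage("advisors", 5),
    Stage("gate_one", 1),
    Stage("reviewers", 5),
    Stage("chairman", 1),
    Stage("gate_two", 1),
)


def agent_calls(include_research: bool) -> int:
    """Count agent calls for one full run, with or without the research pass."""
    return sum(
        stage.agent_calls
        for stage in PIPELINE
        if include_research or not stage.optional
    )
```

Under these assumptions, `agent_calls(False)` counts the thirteen fixed roles and `agent_calls(True)` adds the optional research pass.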
03. The Five Advisors
Five advisor agents analyze the framed question in parallel, each locked to one lens so the five reads cannot converge prematurely.
| Advisor | What it does |
|---|---|
| Failure Analysis | Finds the specific flaw that breaks the decision under real conditions. |
| First Principles | Strips the question back to what it is actually asking. |
| Maximum Upside | Surfaces the upside and adjacent opportunities nobody is naming. |
| Fresh Eyes | Approaches with zero prior context, catching what familiarity hides. |
| Execution | Ignores theory and asks whether this can be done and what the first step is. |
Each advisor returns a structured response with its lens, its primary read, its evidence, and a confidence level, so the synthesis later has something specific to weigh rather than five essays to average.
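One way to picture the parallel dispatch and the structured response is the sketch below. It uses Python's `asyncio` as a stand-in for Claude Code subagent orchestration, which is what the council is actually built on. The `AdvisorResponse` fields follow the four items named above, but the field names, the confidence scale, and the `run_advisor` placeholder are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass, field

# The five fixed lenses from the table above, as hypothetical identifiers.
LENSES = (
    "failure_analysis",
    "first_principles",
    "maximum_upside",
    "fresh_eyes",
    "execution",
)


@dataclass
class AdvisorResponse:
    lens: str
    primary_read: str
    evidence: list = field(default_factory=list)
    confidence: str = "medium"  # assumed scale: "low", "medium", "high"


async def run_advisor(lens: str, question: str) -> AdvisorResponse:
    """Placeholder for a real subagent call; returns a canned response."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return AdvisorResponse(lens=lens, primary_read=f"[{lens}] read of: {question}")


async def dispatch_advisors(question: str) -> list:
    """Start all five advisors at once; no advisor sees another's output."""
    tasks = (run_advisor(lens, question) for lens in LENSES)
    return list(await asyncio.gather(*tasks))
```

A call such as `asyncio.run(dispatch_advisors("Should we ship?"))` returns five responses in lens order, because `asyncio.gather` preserves the order of the awaitables it is given.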
The advisors do not improvise. Each pulls the one or two frameworks most relevant to its lens from a curated library of twenty-five works before it reasons. The library spans:
- Decision science and forecasting: Kahneman, Tetlock, Annie Duke, calibration and pre-mortem methods
- Risk and the unknown: Taleb on fat tails and unknown unknowns
- Strategy and positioning: Rumelt, Thiel, Christensen, Playing to Win, Obviously Awesome
- Systems and causality: Meadows on leverage points, Pearl on cause and effect
- Multidisciplinary mental models: Munger’s latticework
- Power and human behavior: Machiavelli, Le Bon, the Elephant in the Brain, signaling and self-deception
- Execution under pressure: Goldratt, Horowitz, the Stoics
The orchestration makes the agents argue. This library is what they argue from.
04. Anonymization
Before any review happens, each advisor is randomly mapped to a letter from A to E, and the mapping plus the full responses are written to disk immediately. Writing the mapping down before review is a safeguard against a long session losing or confusing which response came from which lens.
The lens label is then stripped from each response before the reviewers see it. If a reviewer could see that response C came from the failure-analysis pass, the label would tell it what to conclude before it read a word. Removing it forces the reviewers to judge the argument on its content, not its origin.
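The anonymization step might look roughly like the following. This is a sketch under stated assumptions: the `anonymize` function, the `mapping.json` filename, and the JSON layout are hypothetical, and the responses are modeled as a plain dict keyed by lens. It shows the ordering that matters, which is that the mapping and full responses reach disk before any reviewer input is produced.

```python
import json
import random
from pathlib import Path
from typing import Optional

LETTERS = ["A", "B", "C", "D", "E"]


def anonymize(
    responses: dict,
    out_dir: Path,
    rng: Optional[random.Random] = None,
) -> dict:
    """Map each lens to a random letter, persist the mapping, return letter -> text.

    `responses` maps lens name -> response text. The lens is only the dict
    key here; a real response that names its lens in the body would need
    that text stripped as well.
    """
    if len(responses) != len(LETTERS):
        raise ValueError("expected exactly five advisor responses")

    rng = rng or random.Random()
    lenses = list(responses)
    rng.shuffle(lenses)
    mapping = dict(zip(LETTERS, lenses))

    # Write the mapping and the full responses to disk before review starts.
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {"mapping": mapping, "responses": responses}
    (out_dir / "mapping.json").write_text(json.dumps(record, indent=2))

    # Reviewers receive letters only, never lens names.
    return {letter: responses[lens] for letter, lens in mapping.items()}
```

Passing a seeded `random.Random` makes a run reproducible for testing; leaving it out gives a fresh shuffle each time.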
Strip the label, and a reviewer has to judge the argument rather than its source.
05. The Five Reviewers
Multi-agent review has an obvious failure mode. Point one model at another model’s output and ask whether it is correct, and it tends to affirm what it reads, because nothing in the setup pushes against it. The council’s review is built to push the other way.
Let one model check another’s work and you have not automated verification, you have automated agreement.
Five peer reviewers then work over the anonymized set, each with its own job, so the critique is as structured as the analysis it examines.
| Reviewer | Job |
|---|---|
| Convergence | Finds the agreement across responses that is more than surface overlap. |
| Gap Finder | Identifies what every response missed. |
| Skeptic | Argues against the strongest response. |
| Devil’s Advocate | Defends the weakest or most unpopular response. |
| Integrator | Finds where separate responses combine into something better. |
The skeptic and the devil’s advocate are the load-bearing pair. One attacks the answer most likely to be accepted on reflex, and the other rescues the answer most likely to be dismissed, so neither the popular choice nor the unpopular one escapes a fair test.
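The reviewer roster above can be sketched as a mapping from role to brief, with each reviewer receiving the same anonymized set. The `REVIEWERS` keys, the `build_review_prompt` helper, and the prompt wording are hypothetical; only the five briefs come from the table.

```python
# Role identifiers are illustrative; the briefs are taken from the table above.
REVIEWERS = {
    "convergence": "Find the agreement across responses that is more than surface overlap.",
    "gap_finder": "Identify what every response missed.",
    "skeptic": "Argue against the strongest response.",
    "devils_advocate": "Defend the weakest or most unpopular response.",
    "integrator": "Find where separate responses combine into something better.",
}


def build_review_prompt(role: str, anonymized: dict) -> str:
    """Combine one reviewer's brief with the lettered responses.

    `anonymized` maps a letter (A to E) to response text. No lens names
    are passed in, so none can appear in the prompt structure.
    """
    brief = REVIEWERS[role]
    body = "\n\n".join(
        f"Response {letter}:\n{text}" for letter, text in sorted(anonymized.items())
    )
    return f"Your job: {brief}\n\n{body}"
```

Every reviewer sees an identical body and differs only in its brief, which keeps the five critiques comparable.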
06. The Two Gates
Two audit points let the pipeline stop itself rather than carry a weak result all the way to the end.
The first gate runs after the advisors and before review. It checks each response for whether it committed to its lens and said something specific, and a failure surfaces to the user before the council spends five more agent calls reviewing thin material. The second gate runs after synthesis and before the report is written. It independently checks that the verdict represents the inputs rather than quietly favoring one pass, and a failure sends the synthesis back once with specific corrections.
Both gates are cheap relative to what they protect. Catching an off-angle advisor early, or a synthesis that overweights one voice, saves the whole run from producing a confident verdict built on a weak foundation.
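The control flow of the two gates can be sketched as follows. The checks themselves are passed in as callables because the text does not specify how they are implemented; `Audit`, `GateFailure`, `gate_one`, and `gate_two` are hypothetical names. What happens if the synthesis fails its audit a second time is not stated in the text, so this sketch simply returns the final audit result and leaves that decision to the caller.

```python
from typing import Callable, NamedTuple


class Audit(NamedTuple):
    passed: bool
    corrections: list


class GateFailure(Exception):
    """Raised so a first-gate failure reaches the user before review starts."""


def gate_one(responses: dict, check: Callable[[str, str], Audit]) -> None:
    """Check every advisor response; stop the run if any of them fails."""
    failures = {}
    for lens, text in responses.items():
        audit = check(lens, text)
        if not audit.passed:
            failures[lens] = audit.corrections
    if failures:
        raise GateFailure(failures)


def gate_two(
    synthesize: Callable[[list], str],
    audit: Callable[[str], Audit],
) -> tuple:
    """Audit the synthesis; on failure, send it back exactly once."""
    verdict = synthesize([])
    result = audit(verdict)
    if not result.passed:
        # One retry only, carrying the auditor's specific corrections.
        verdict = synthesize(result.corrections)
        result = audit(verdict)
    return verdict, result
```

Raising in the first gate and retrying in the second mirrors the asymmetry described above: thin advisor output is surfaced to the user, while a skewed synthesis gets one chance to correct itself.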
07. The Verdict
The chairman synthesizes the verdict starting from the strongest disagreement rather than the easy consensus, on the principle that the place the analysis clashes is where the real decision lives. The output is fixed and identical in shape every run, which is what makes it usable under pressure.
- Where the passes agree: the convergence that is more than surface overlap
- Where they clash: the real tradeoff the decision turns on
- Blind spots: what every angle missed
- Recommendation: the defensible call
- One thing first: the single next action
Every run produces two artifacts: a self-contained HTML report for reading, and a full markdown transcript that preserves the anonymization mapping, all five analyses, all five reviews, the synthesis, and the audit result. The recommendation is not a black box. The entire reasoning path that produced it stays on the record, which is the difference between a verdict you can interrogate and an answer you have to take on faith.
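A fixed verdict shape is easy to enforce with a small data type. The sketch below is an assumption about how that could be done, not the council's actual report code: the `Verdict` field names and the `render_markdown` helper are hypothetical, and the markdown heading layout is one plausible choice for the transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    agreement: str
    clash: str
    blind_spots: str
    recommendation: str
    first_action: str


# Section order is fixed here so every run renders the same shape.
SECTIONS = (
    ("agreement", "Where the passes agree"),
    ("clash", "Where they clash"),
    ("blind_spots", "Blind spots"),
    ("recommendation", "Recommendation"),
    ("first_action", "One thing first"),
)


def render_markdown(verdict: Verdict) -> str:
    """Render the five sections in their fixed order."""
    parts = [
        f"## {title}\n\n{getattr(verdict, name)}\n" for name, title in SECTIONS
    ]
    return "\n".join(parts)
```

Because all five fields are required, a synthesis that omits a section fails at construction time rather than producing a report with a silent gap.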
A single model gives you an answer. The council gives you a recommendation you can interrogate.