ADR-0014: M3 Deliberation Chamber
Status
accepted
Context
The founder currently acts as a manual courier between LLMs for strategic questions: asking Claude, then Perplexity for a second opinion, then ChatGPT, synthesizing the answers by hand, and returning to Claude for implementation. This is slow, error-prone, and inconsistent (the founder may skip some LLMs), and it leaves no audit trail. At the same time, single-LLM answers carry model-specific biases that can propagate into Synth Nova decisions. A structured multi-LLM deliberation module would automate the courier work while preserving the epistemic value of diverse perspectives.
M3 is the literal implementation of Constitution Law 7 (Verify — trust no single source) for strategic questions, and must operate within Law 5 (Human Veto) and Law 8 (Tokens are Capital) because multi-LLM sessions cost 3-5x a single query.
Decision
Add Module M3 — Deliberation Chamber to the Synth Nova roadmap. M3 provides structured multi-LLM deliberation with arbitrated synthesis.
Core commitments:
- v1 panel: Claude Sonnet 4, GPT-4, Gemini Pro. Three independent providers for minimum viable quorum.
- Arbiter: Claude Sonnet 4 (shared with existing Judge agent in synth-brain). Conflict-of-interest mitigations defined in MultiLLMDeliberationPolicy; arbiter evaluates methodology and evidence, not content authorship.
- Trigger model: pipeline agents or founder may propose; founder must explicitly approve each session (Constitution Law 5). System never auto-delegates.
- Two-phase deliberation: independent Phase 1 responses, then optional one-round cross-examination if divergence detected. No infinite loops.
- Honest uncertainty: arbiter may return “no consensus — founder review recommended” as a valid outcome. Forced synthesis when panelists genuinely disagree is prohibited.
- Strategic document and operational policy ratified now; implementation sprint parallel to M2 or after M2 based on capacity.
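The trigger, two-phase, and honest-uncertainty commitments above can be sketched as a single control flow. This is a minimal illustration, not the M3 design: `run_session`, `SessionResult`, `NO_CONSENSUS`, and the callback signatures are hypothetical names, and the panelists, divergence detector, and arbiter are passed in as plain callables so that no provider API is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical names throughout; the ADR does not define these interfaces.
Panelist = Callable[[str], str]  # prompt -> verbatim response
NO_CONSENSUS = "NO_CONSENSUS"    # stands in for the "no consensus" outcome


@dataclass
class SessionResult:
    approved: bool
    phase1: Dict[str, str] = field(default_factory=dict)  # independent answers
    phase2: Dict[str, str] = field(default_factory=dict)  # cross-examination, if any
    verdict: Optional[str] = None


def run_session(
    question: str,
    panel: Dict[str, Panelist],
    founder_approves: Callable[[str], bool],
    diverges: Callable[[Dict[str, str]], bool],
    arbitrate: Callable[[str, Dict[str, str], Dict[str, str]], Optional[str]],
) -> SessionResult:
    # Law 5 gate: nothing is sent to any panelist without explicit approval.
    if not founder_approves(question):
        return SessionResult(approved=False)

    # Phase 1: each panelist sees only the question, never another answer.
    phase1 = {name: ask(question) for name, ask in panel.items()}

    # Phase 2: at most one cross-examination round, only on divergence.
    phase2: Dict[str, str] = {}
    if diverges(phase1):
        for name, ask in panel.items():
            others = "\n".join(f"[{n}] {r}" for n, r in phase1.items() if n != name)
            phase2[name] = ask(f"{question}\n\nOther panelists answered:\n{others}")

    # The arbiter may decline to synthesize; None is mapped to NO_CONSENSUS
    # rather than forcing a verdict.
    synthesis = arbitrate(question, phase1, phase2)
    verdict = synthesis if synthesis is not None else NO_CONSENSUS
    return SessionResult(True, phase1, phase2, verdict)
```

Because the cross-examination step is a single `if` rather than a loop, the "no infinite loops" commitment holds by construction in this sketch.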
Alternatives Considered
Option A: Continue manual courier workflow
- Pros: zero infrastructure cost; fully under founder control.
- Cons: slow, error-prone, no audit trail, inconsistent (founder may skip LLMs). Doesn’t scale as pipeline matures.
Option B: Rely on single LLM (Claude) for all strategic questions
- Pros: simplest; cheapest; no new integrations.
- Cons: single-model bias risk; reduces validation; violates spirit of Constitution Law 7 (Verify). A single model’s blind spot becomes a systemic blind spot.
Option C: Ensemble voting (majority wins)
- Pros: simple aggregation; no arbiter cost.
- Cons: epistemically weak. Averaging or majority-voting LLM outputs discards the reasoning behind each answer: a minority view may be correct, and reasoning quality matters more than vote count.
Option D: Use external service (e.g., Consensus, LMSYS)
- Pros: no build cost.
- Cons: external dependency; per-call cost; lack of customization for Synth Nova integration points (CEO/Director/Judge pipeline hooks); audit trail lives outside our observability stack.
Option E: Structured deliberation with arbitrated synthesis ← chosen
- Pros: preserves verbatim panelist responses (audit trail); arbiter evaluates evidence quality, not vote count; honest-uncertainty outcomes allowed; integrates with M1/M2 pipelines; human-gated per Law 5.
- Cons: 3-5x cost vs single LLM; arbiter conflict-of-interest (Claude-as-both-panelist-and-arbiter) requires explicit mitigation; additional API account setup (OpenAI, Google).
- Why chosen: directly replaces manual courier work, operationalizes Law 7 for strategic decisions, and establishes an auditable multi-model consultation primitive that M1 and M2 can invoke.
Consequences
Positive:
- New strategic asset in manifest: 07-Roadmap/Deliberation-Chamber-Module.md.
- New operational policy: 05-Rules/MultiLLMDeliberationPolicy.md.
- Multi-source validation primitive available to M1 (Research confidence < threshold, Financial Modeler contradictions, Judge FAIL) and M2 (ambiguous team-fit signals).
- Audit trail per Law 6: verbatim panelist responses, arbiter reasoning, founder’s subsequent action all preserved.
- Potential evolution path: M3 can become external product feature (v2+) if internal use demonstrates value.
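As an illustration of the audit-trail item above, one possible shape for a per-session record is sketched below. The class name `ChamberTranscript`, its field names, and the one-JSON-object-per-line format are assumptions made for this example; the actual retention format is governed by ObservabilityContract.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class ChamberTranscript:
    """Hypothetical audit record; field names are illustrative only."""
    session_id: str
    question: str
    panelist_responses: Dict[str, str]     # verbatim, keyed by panelist name
    arbiter_reasoning: str
    founder_action: Optional[str] = None   # filled in once the founder acts


def to_audit_line(transcript: ChamberTranscript) -> str:
    # One JSON object per line; newlines inside responses are escaped by
    # json.dumps, so each record stays on a single line.
    return json.dumps(asdict(transcript), ensure_ascii=False, sort_keys=True)
```

Keeping `founder_action` optional lets the record be written when arbitration finishes and completed later, once the founder's subsequent action is known.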
Negative / Trade-offs:
- New infrastructure: adapters for GPT-4 (OpenAI API) and Gemini (Google API) — new API keys, new cost tracking, new failure modes.
- Per-session cost ~100 — non-zero recurring expense.
- Arbiter conflict-of-interest requires active monitoring (if Claude-panelist view wins >50% of arbitrated cases, re-evaluate arbiter choice).
- Founder approval friction per session (intentional per Law 5, but real cost in flow).
Mitigations:
- Rate limits and cost caps defined in MultiLLMDeliberationPolicy (10 sessions/day, 100/month budget).
- Conflict-of-interest mitigations (explicit arbiter system prompt, self-check, periodic founder review).
- Integration Triage Policy (IntegrationTriagePolicy) governs any v2+ panelist additions (Grok, DeepSeek, Perplexity, Mistral, Llama).
- Honest-uncertainty mandate prevents false-consensus outputs.
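The rate limit and cost cap in the first mitigation could be enforced with a pre-session check along these lines. This is a sketch under stated assumptions: `BudgetGuard` and its methods are hypothetical, the defaults copy the figures quoted above (10 sessions/day, 100/month), and the budget is treated as a plain number because the currency unit is not stated in this document.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class BudgetGuard:
    """Hypothetical guard; the real limits live in MultiLLMDeliberationPolicy."""
    max_sessions_per_day: int = 10
    monthly_budget: float = 100.0  # unit as in the policy; not specified here
    ledger: List[Tuple[date, float]] = field(default_factory=list)

    def can_start(self, today: date, estimated_cost: float) -> Tuple[bool, str]:
        # Daily rate limit: count sessions already recorded for this date.
        sessions_today = sum(1 for d, _ in self.ledger if d == today)
        if sessions_today >= self.max_sessions_per_day:
            return False, "daily session limit reached"

        # Monthly cap: sum recorded spend within the same calendar month.
        spent = sum(
            cost for d, cost in self.ledger
            if (d.year, d.month) == (today.year, today.month)
        )
        if spent + estimated_cost > self.monthly_budget:
            return False, "monthly budget would be exceeded"
        return True, "ok"

    def record(self, today: date, actual_cost: float) -> None:
        # Call after a session completes so later checks see the spend.
        self.ledger.append((today, actual_cost))
```

The check would run before the founder approval prompt, so that a session that cannot be afforded is never proposed in the first place.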
Follow-ups
- OpenAI API account setup (billing separate from Anthropic).
- Google Gemini API account setup.
- Detailed agent specs during M3 implementation sprint (panelist adapters, divergence detector, arbiter orchestrator).
- Arbiter calibration process (founder reviews first 10-20 sessions for arbitration quality).
- CLI-only v1 vs Streamlit integration decision.
- Per-agent ADR for each panelist adapter during implementation.
- Review: if Claude-panelist view wins >50% of arbitrated cases, re-evaluate arbiter choice (avoid Claude echo chamber).
- Integration hooks spec for M1 (Research, Financial Modeler, Judge) and M2 (Team Assessment) triggers.
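The arbiter review trigger in the follow-ups above (Claude-panelist view winning more than 50% of arbitrated cases) reduces to a small calculation. The sketch below makes two assumptions that this ADR does not settle: no-consensus sessions (recorded as `None`) are excluded from the denominator, and a minimum sample size of 10 decided sessions is required before the check fires. Function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def panelist_win_rate(winners: Sequence[Optional[str]], panelist: str) -> float:
    """Share of decided sessions in which `panelist`'s view prevailed."""
    # Assumption: None marks a no-consensus session and is not counted.
    decided = [w for w in winners if w is not None]
    if not decided:
        return 0.0
    return sum(w == panelist for w in decided) / len(decided)


def needs_arbiter_review(
    winners: Sequence[Optional[str]],
    panelist: str = "claude",
    threshold: float = 0.5,   # the >50% figure from this ADR
    min_decided: int = 10,    # assumed minimum sample, not from the ADR
) -> bool:
    decided = [w for w in winners if w is not None]
    if len(decided) < min_decided:
        return False  # too few decided sessions to draw a conclusion
    return panelist_win_rate(decided, panelist) > threshold
```

For example, `needs_arbiter_review(["claude"] * 6 + ["gpt"] * 4)` evaluates to `True` (6 of 10 decided sessions), while an even 5-5 split does not cross the threshold.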
References
- Constitution — supreme governance (Laws 5, 6, 7, 8 directly apply).
- Deliberation-Chamber-Module — M3 strategic document (07-Roadmap).
- MultiLLMDeliberationPolicy — operational rules for M3 (05-Rules).
- Niche-Evaluation-Module — M1 strategic document; M3 may be invoked from M1 pipeline.
- Team-Implementation-Module — M2 strategic document; M3 may be invoked from M2 pipeline.
- ADR-0011-integration-triage-policy — governs v2+ LLM provider additions.
- ADR-0012-constitution — Constitutional foundation.
- ADR-0013-m2-team-implementation-navigator — prior module in the M-series.
- ObservabilityContract — retention of Chamber transcripts.
- DecisionRights — approval escalation rules for per-session cost.
- ConflictResolution — Judge agent handles intra-system conflicts; M3 handles external-knowledge conflicts.
- Decision-Log — ADR index.