ADR-0014: M3 Deliberation Chamber

Status

Accepted

Context

The founder currently performs manual courier work between LLMs for strategic questions: ask Claude, then Perplexity for a second opinion, then ChatGPT, synthesize the answers by hand, and return to Claude for implementation. This workflow is slow, error-prone, inconsistent (the founder may skip some LLMs), and captures no audit trail. At the same time, single-LLM answers risk model-specific biases that can propagate into Synth Nova decisions. A structured multi-LLM deliberation module would automate the courier work while preserving the epistemic value of diverse perspectives.

M3 is the literal implementation of Constitution Law 7 (Verify — trust no single source) for strategic questions, and must operate within Law 5 (Human Veto) and Law 8 (Tokens are Capital), because multi-LLM sessions cost 3-5x as much as a single query.

Decision

Add Module M3 — Deliberation Chamber to the Synth Nova roadmap. M3 provides structured multi-LLM deliberation with arbitrated synthesis.

Core commitments:

  1. v1 panel: Claude Sonnet 4, GPT-4, Gemini Pro. Three independent providers for minimum viable quorum.
  2. Arbiter: Claude Sonnet 4 (shared with existing Judge agent in synth-brain). Conflict-of-interest mitigations defined in MultiLLMDeliberationPolicy; arbiter evaluates methodology and evidence, not content authorship.
  3. Trigger model: pipeline agents or founder may propose; founder must explicitly approve each session (Constitution Law 5). System never auto-delegates.
  4. Two-phase deliberation: independent Phase 1 responses, then an optional one-round cross-examination if divergence is detected. No infinite loops.
  5. Honest uncertainty: arbiter may return “no consensus — founder review recommended” as a valid outcome. Forced synthesis when panelists genuinely disagree is prohibited.
  6. Strategic document and operational policy are ratified now; the implementation sprint runs in parallel with M2 or after M2, depending on capacity.
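The flow described in commitments 3-5 can be sketched as follows. This is a minimal illustration, not the actual Synth Nova implementation: `run_session`, `ask`, `diverges`, and `arbitrate` are hypothetical names, and the callers would supply real provider adapters in their place.

```python
from dataclasses import dataclass

@dataclass
class Response:
    model: str   # panelist identifier
    answer: str  # verbatim response, preserved for the audit trail

def run_session(question, panel, ask, diverges, arbitrate, approved):
    """One deliberation session: founder gate, Phase 1, at most one Phase 2, arbitration."""
    # Law 5: the founder must explicitly approve each session; never auto-delegate.
    if not approved:
        return {"status": "not_approved", "verdict": None, "transcript": []}

    # Phase 1: independent answers; no panelist sees another's response.
    round1 = [Response(m, ask(m, question)) for m in panel]
    transcript = [round1]

    # Phase 2: at most ONE cross-examination round, only on divergence. No loops.
    if diverges(round1):
        round2 = []
        for r in round1:
            peers = "\n".join(o.answer for o in round1 if o.model != r.model)
            prompt = f"{question}\n\nPeer answers:\n{peers}\n\nRevise or defend your answer."
            round2.append(Response(r.model, ask(r.model, prompt)))
        transcript.append(round2)

    # The arbiter may decline to synthesize: None means
    # "no consensus — founder review recommended", a valid outcome.
    verdict = arbitrate(question, transcript)
    status = "synthesized" if verdict is not None else "no_consensus"
    return {"status": status, "verdict": verdict, "transcript": transcript}
```

Note that the verbatim `transcript` is returned alongside the verdict, so the audit-trail requirement (Law 6) falls out of the control flow rather than being bolted on.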

Alternatives Considered

Option A: Continue manual courier workflow

  • Pros: zero infrastructure cost; fully under founder control.
  • Cons: slow, error-prone, no audit trail, inconsistent (founder may skip LLMs). Doesn’t scale as pipeline matures.

Option B: Rely on single LLM (Claude) for all strategic questions

  • Pros: simplest; cheapest; no new integrations.
  • Cons: single-model bias risk; reduces validation; violates spirit of Constitution Law 7 (Verify). A single model’s blind spot becomes a systemic blind spot.

Option C: Ensemble voting (majority wins)

  • Pros: simple aggregation; no arbiter cost.
  • Cons: epistemically weak. Averaging or majority-voting LLM outputs loses the signal — minority view may be correct, and reasoning quality matters more than vote count.

Option D: Use external service (e.g., Consensus, LMSYS)

  • Pros: no build cost.
  • Cons: external dependency; per-call cost; lack of customization for Synth Nova integration points (CEO/Director/Judge pipeline hooks); audit trail lives outside our observability stack.

Option E: Structured deliberation with arbitrated synthesis ← chosen

  • Pros: preserves verbatim panelist responses (audit trail); arbiter evaluates evidence quality, not vote count; honest-uncertainty outcomes allowed; integrates with M1/M2 pipelines; human-gated per Law 5.
  • Cons: 3-5x cost vs single LLM; arbiter conflict-of-interest (Claude-as-both-panelist-and-arbiter) requires explicit mitigation; additional API account setup (OpenAI, Google).
  • Why chosen: directly replaces manual courier work, operationalizes Law 7 for strategic decisions, and establishes an auditable multi-model consultation primitive that M1 and M2 can invoke.

Consequences

Positive:

  • New strategic asset in manifest: 07-Roadmap/Deliberation-Chamber-Module.md.
  • New operational policy: 05-Rules/MultiLLMDeliberationPolicy.md.
  • Multi-source validation primitive available to M1 (Research confidence < threshold, Financial Modeler contradictions, Judge FAIL) and M2 (ambiguous team-fit signals).
  • Audit trail per Law 6: verbatim panelist responses, arbiter reasoning, founder’s subsequent action all preserved.
  • Potential evolution path: M3 can become external product feature (v2+) if internal use demonstrates value.
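The M1/M2 trigger conditions listed above could be expressed as a single proposal hook. The function name, the signal keys, and the 0.7 confidence threshold below are illustrative assumptions, not values from the policy; a positive result only proposes a session, which the founder must still approve (Law 5).

```python
def should_propose_deliberation(signal: dict) -> bool:
    """Return True if a pipeline signal warrants PROPOSING an M3 session."""
    CONF_THRESHOLD = 0.7  # illustrative; the real threshold lives in the policy
    return (
        signal.get("research_confidence", 1.0) < CONF_THRESHOLD  # M1 Research
        or signal.get("financial_contradiction", False)          # M1 Financial Modeler
        or signal.get("judge_verdict") == "FAIL"                 # M1 Judge
        or signal.get("team_fit") == "ambiguous"                 # M2 Team Assessment
    )
```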

Negative / Trade-offs:

  • New infrastructure: adapters for GPT-4 (OpenAI API) and Gemini (Google API) — new API keys, new cost tracking, new failure modes.
  • Per-session cost ~100 — non-zero recurring expense.
  • Arbiter conflict-of-interest requires active monitoring (if Claude-panelist view wins >50% of arbitrated cases, re-evaluate arbiter choice).
  • Founder approval friction per session (intentional per Law 5, but real cost in flow).

Mitigations:

  • Rate limits and cost caps defined in MultiLLMDeliberationPolicy (10 sessions/day, 100/month budget).
  • Conflict-of-interest mitigations (explicit arbiter system prompt, self-check, periodic founder review).
  • Integration Triage Policy (IntegrationTriagePolicy) governs any v2+ panelist additions (Grok, DeepSeek, Perplexity, Mistral, Llama).
  • Honest-uncertainty mandate prevents false-consensus outputs.
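The rate and budget caps could be enforced with bookkeeping as simple as the sketch below, using the figures stated in this ADR (10 sessions/day, 100/month). The class is illustrative only; counter-reset logic at day and month boundaries is omitted for brevity.

```python
class DeliberationBudget:
    """Sketch of the MultiLLMDeliberationPolicy caps; not the real policy code."""
    DAILY_SESSION_CAP = 10
    MONTHLY_BUDGET = 100  # same currency unit as the policy document

    def __init__(self):
        self.sessions_today = 0
        self.spent_this_month = 0.0

    def can_start(self, estimated_cost: float) -> bool:
        # Refuse any session that would break either cap.
        return (self.sessions_today < self.DAILY_SESSION_CAP
                and self.spent_this_month + estimated_cost <= self.MONTHLY_BUDGET)

    def record(self, actual_cost: float) -> None:
        # Called after a session completes; feeds the next can_start() check.
        self.sessions_today += 1
        self.spent_this_month += actual_cost
```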

Follow-ups

  • OpenAI API account setup (billing separate from Anthropic).
  • Google Gemini API account setup.
  • Detailed agent specs during M3 implementation sprint (panelist adapters, divergence detector, arbiter orchestrator).
  • Arbiter calibration process (founder reviews first 10-20 sessions for arbitration quality).
  • CLI-only v1 vs Streamlit integration decision.
  • Per-agent ADR for each panelist adapter during implementation.
  • Review: if Claude-panelist view wins >50% of arbitrated cases, re-evaluate arbiter choice (avoid Claude echo chamber).
  • Integration hooks spec for M1 (Research, Financial Modeler, Judge) and M2 (Team Assessment) triggers.

References