Agent-Intake Implementation Spec

Purpose: Detailed execution plan for Claude Code to implement Agent-Intake — a new agent that structures raw user queries into a canonical brief format before the CEO delegates to the Intel Director.
Author: Co-Developer Claude (Opus 4.7), 2026-04-20
Status: Ready to execute, pending Denis approval
Estimated effort: 1-2 days (6-10 hours of Claude Code work, can be split across 2 sessions)
Expected impact: Cleaner briefs → better downstream agent output. Primary value: fixes the “Director missed half the brief” issue (R26 Director 4.0/10) at the source.


Why this matters

Current flow:

User query (free text) → Streamlit UI → CEO → Intel Director → ... pipeline

Problem: User briefs vary wildly in structure. Some are a one-line “анализ ниши X” (“niche X analysis”). Some are 500-word multi-section ТЗ (technical briefs). Some are voice-memo transcripts with filler words. The CEO and Intel Director interpret ambiguous briefs inconsistently.

R26 Judge feedback directly validates this: “Brief requests avatars, CJM, funnel analysis, unit-economics, ROI by channel, tech stack, risk matrix — none addressed in summary” — Director missed explicit asks because the brief was a wall of text without clear section markers.

Fix the problem at the source, not the symptoms: instead of only patching the Director’s summarization skill (the R26 brief-coverage checklist already does that), structure the brief before it reaches the CEO. Give the Director a parsed, structured object with explicit fields.

Architecture

New flow

User query → Agent-Intake → Structured Brief (JSON) → CEO → Intel Director → pipeline

Agent-Intake responsibilities

  1. Parse free-text query into structured fields
  2. Classify intent: niche_analysis / team_assessment / strategic_question / other
  3. Extract explicit requirements (list of deliverables user asked for)
  4. Extract implicit context (target audience, budget constraints, geography, monetization model)
  5. Flag ambiguities: what’s unclear, what needs clarification
  6. Produce canonical StructuredBrief object for downstream agents

Position in pipeline

  • After: User submits query via Streamlit
  • Before: CEO receives delegation
  • Sibling to: nothing — it sits in the orchestration layer as the first agent

Cost

  • Single Sonnet 4.5 call per user query
  • Input: user query (usually < 500 tokens)
  • Output: structured JSON (~300-500 tokens)
  • Estimated cost: $0.005 per call — negligible
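As a back-of-envelope check, using illustrative Sonnet-class rates of $3/M input and $15/M output tokens (assumed figures — verify against current pricing), the per-call cost stays well under a cent even counting the system prompt as input:

```python
# Rough per-call cost estimate. Token counts and per-token rates below are
# illustrative assumptions, not measured values.
INPUT_RATE = 3.0 / 1_000_000    # $/input token (assumed)
OUTPUT_RATE = 15.0 / 1_000_000  # $/output token (assumed)

def intake_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one intake call."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# ~500-token query + ~500-token system prompt in, ~400 tokens of JSON out
print(f"${intake_cost(1000, 400):.4f}")  # prints $0.0090
```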

Latency

  • Sonnet 4.5 call ~5-10 seconds
  • Added to total pipeline time (currently 24 min). Negligible relative impact.

Data model

StructuredBrief (Pydantic model)

from pydantic import BaseModel, Field
from typing import Literal, Optional
from datetime import datetime
 
class DeliverableRequirement(BaseModel):
    """A specific deliverable the user asked for."""
    name: str  # e.g. "market_sizing", "unit_economics", "customer_journey_map"
    raw_phrase: str  # how user phrased it: "финансовая модель с unit-экономикой"
    priority: Literal["must_have", "nice_to_have"] = "must_have"
 
class Ambiguity(BaseModel):
    """Something unclear in the brief."""
    topic: str  # e.g. "target_geography"
    what_is_unclear: str  # "User mentioned 'global' but also 'Russian-speaking audience'"
    suggested_default: Optional[str] = None  # Best guess if no clarification
 
class StructuredBrief(BaseModel):
    """Canonical brief format for downstream agents."""
    # Raw input
    raw_query: str
    received_at: datetime
    
    # Classification
    intent: Literal["niche_analysis", "team_assessment", "strategic_question", "follow_up", "other"]
    intent_confidence: float = Field(ge=0.0, le=1.0)
    
    # Extracted fields (nullable — not all briefs have all fields)
    topic: str  # short identifier: "AI esoterics", "European fintech", "mobile games Brazil"
    target_audience: Optional[str] = None
    geography: Optional[str] = None  # "RU", "global", "EU + NA"
    monetization_model: Optional[str] = None  # "subscription $5-15/$15-50/$50+"
    budget_constraints: Optional[str] = None
    timeline: Optional[str] = None
    
    # Deliverables
    deliverables: list[DeliverableRequirement]
    
    # Gaps
    ambiguities: list[Ambiguity]
    
    # Meta
    intake_cost_usd: Optional[float] = None
    intake_model: Optional[str] = None
    intake_duration_s: Optional[float] = None

Canonical deliverable names

To keep downstream agents consistent, Agent-Intake normalizes user phrasing to canonical names:

| User might say | Canonical name |
| --- | --- |
| “анализ рынка”, “market analysis”, “размер рынка” | `market_sizing` |
| “юнит-экономика”, “unit economics”, “LTV CAC” | `unit_economics` |
| “CJM”, “путь клиента”, “customer journey” | `customer_journey_map` |
| “воронки”, “sales funnels”, “DM funnels” | `sales_funnels` |
| “аватары”, “avatars”, “портреты ЦА” | `audience_avatars` |
| “tech stack”, “технологии”, “стэк” | `tech_stack` |
| “риски”, “risk matrix” | `risk_analysis` |
| “ROI”, “ROI по каналам” | `roi_analysis` |
| “go/no-go”, “решение”, “recommendation” | `go_no_go_recommendation` |

(+ ~15 more mappings)

This enables downstream agents to check brief.deliverables by canonical name instead of parsing free text.
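On the downstream side, normalization could look like this sketch (the mapping is a subset of the table above; `normalize_deliverable` and its snake_case fallback rule are illustrative, not part of the spec’s code):

```python
import re

# Subset of the canonical mapping table (illustrative)
CANONICAL = {
    "анализ рынка": "market_sizing",
    "market analysis": "market_sizing",
    "юнит-экономика": "unit_economics",
    "unit economics": "unit_economics",
    "cjm": "customer_journey_map",
    "путь клиента": "customer_journey_map",
    "risk matrix": "risk_analysis",
}

def normalize_deliverable(phrase: str) -> str:
    """Map a user phrase to a canonical name; fall back to a snake_case slug."""
    key = phrase.strip().lower()
    if key in CANONICAL:
        return CANONICAL[key]
    # Custom fallback: unknown asks become best-effort snake_case names
    slug = re.sub(r"[^a-z0-9]+", "_", key).strip("_")
    return slug or "unknown_deliverable"
```

With this in place, downstream checks reduce to set membership on stable identifiers rather than fuzzy string matching.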


Implementation

File 1: Agent-Intake module

Path: src/synth_brain/agents/agent_intake.py

"""Agent-Intake — structures raw user queries into canonical briefs."""
import json
import os
from datetime import datetime, timezone
from typing import Any
 
from pydantic import BaseModel
# ... imports for StructuredBrief, DeliverableRequirement, Ambiguity
 
from synth_brain.llm.providers.anthropic import AnthropicProvider
from synth_brain.llm.providers.openai import OpenAIProvider
 
INTAKE_SYSTEM_PROMPT = """You are Agent-Intake. You receive a free-text user query and produce a structured brief that downstream agents can work with.
 
Your job is to:
1. Classify intent (niche_analysis / team_assessment / strategic_question / follow_up / other)
2. Extract the topic, target audience, geography, monetization model, budget, timeline (any that are stated)
3. List all deliverables the user explicitly requested
4. Flag any ambiguities that need clarification
 
## Canonical deliverable names
 
Map user phrasing to these canonical names:
- market_sizing — TAM/SAM/SOM, market size, рынок
- unit_economics — юнит-экономика, LTV, CAC, contribution margin
- customer_journey_map — CJM, путь клиента
- sales_funnels — воронки продаж, DM funnels, no-DM funnels
- audience_avatars — аватары, портреты ЦА, segments
- audience_segmentation — сегментация (if distinct from avatars)
- tech_stack — tech stack, технологии
- risk_analysis — риски, risk matrix
- roi_analysis — ROI, unit economics by channel
- financial_model — финансовая модель, P&L, projections
- competitor_landscape — конкуренты, competitor analysis
- content_strategy — контент-стратегия, video formats
- sales_scripts — sales scripts, call scripts
- go_no_go_recommendation — go/no-go, решение, verdict
- mvp_scope — MVP scope, минимальный продукт
- legal_considerations — юридические вопросы, legal, compliance
 
If user requests something NOT in this list, create a custom canonical name in snake_case and include it — downstream will handle as best-effort.
 
## Output format
 
Respond ONLY with valid JSON matching this schema:
 
{
  "intent": "niche_analysis",
  "intent_confidence": 0.9,
  "topic": "brief topic identifier (2-5 words)",
  "target_audience": "description or null",
  "geography": "RU or global or null",
  "monetization_model": "description or null",
  "budget_constraints": "description or null",
  "timeline": "description or null",
  "deliverables": [
    {"name": "market_sizing", "raw_phrase": "анализ рынка", "priority": "must_have"},
    ...
  ],
  "ambiguities": [
    {"topic": "geography", "what_is_unclear": "...", "suggested_default": "..."}
  ]
}
 
## Rules
 
- Be concise. If information is not stated, use null. Don't guess values.
- Classify confidence honestly. If query is "анализ ниши астрологии" (very short), intent_confidence can still be 0.9 because intent is obvious. If query is ambiguous between strategic question and niche analysis, confidence should be 0.5-0.7.
- List every explicit deliverable. If user says "финансовая модель с unit-экономикой и ROI по каналам", that's THREE deliverables: financial_model, unit_economics, roi_analysis.
- Ambiguities are for things that need human clarification. Don't list preferences or assumptions as ambiguities.
"""
 
def run_intake(
    raw_query: str,
    provider: str | None = None,
    model: str | None = None,
) -> dict[str, Any]:
    """
    Structure a raw query into a canonical brief.
    Returns dict matching StructuredBrief schema.
    Never raises — returns degraded brief with raw_query + intent=other on error.
    """
    provider = provider or os.environ.get("INTAKE_PROVIDER", "anthropic")
    model = model or os.environ.get("INTAKE_MODEL", "claude-sonnet-4-5-20250929")
    
    received_at = datetime.now(timezone.utc)
    
    try:
        if provider == "anthropic":
            client = AnthropicProvider(api_key=os.environ["ANTHROPIC_API_KEY"])
        elif provider == "openai":
            client = OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"])
        else:
            raise ValueError(f"Unknown INTAKE_PROVIDER: {provider}")
        
        response = client.complete(
            model=model,
            system=INTAKE_SYSTEM_PROMPT,
            messages=[{"role": "user", "content": raw_query}],
            max_tokens=2000,
            response_format={"type": "json_object"} if provider == "openai" else None,
        )
        
        parsed = json.loads(response.content)
        parsed["raw_query"] = raw_query
        parsed["received_at"] = received_at.isoformat()
        parsed["intake_cost_usd"] = response.cost_usd
        parsed["intake_model"] = model
        parsed["intake_duration_s"] = response.duration_s
        return parsed
    
    except Exception as e:
        # Degraded brief — don't block pipeline
        return {
            "raw_query": raw_query,
            "received_at": received_at.isoformat(),
            "intent": "other",
            "intent_confidence": 0.1,
            "topic": raw_query[:50],
            "target_audience": None,
            "geography": None,
            "monetization_model": None,
            "budget_constraints": None,
            "timeline": None,
            "deliverables": [],
            "ambiguities": [{"topic": "intake_failed", "what_is_unclear": str(e)[:200], "suggested_default": None}],
            "intake_cost_usd": 0,
            "intake_model": model,
            "intake_duration_s": 0,
        }
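One hardening worth considering for `run_intake`: models occasionally wrap their JSON in markdown fences, which would needlessly trip the degraded path. A tolerant parser (a sketch — `parse_llm_json` is not part of the module above) strips fences before parsing:

```python
import json
import re

def parse_llm_json(text: str) -> dict:
    """Parse JSON from an LLM response, tolerating markdown code fences."""
    text = text.strip()
    # Extract the fenced body if the model wrapped its JSON in ``` fences
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)

# Both forms parse to the same dict
assert parse_llm_json('{"intent": "other"}') == {"intent": "other"}
assert parse_llm_json('```json\n{"intent": "other"}\n```') == {"intent": "other"}
```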

File 2: Integration into orchestration

Inspection first: Find where CEO receives the query. Likely in src/synth_brain_ui/run_m1_query.py or src/synth_brain/agents/runner.py.

Pattern: Call run_intake() before run_ceo(). Pass StructuredBrief (as dict) to CEO.

import json
from pathlib import Path  # output_dir below is assumed to be a Path from the existing runner

from synth_brain.agents.agent_intake import run_intake
 
# In pipeline initialization:
def run_pipeline(user_query: str, ...):
    # NEW: intake step
    brief = run_intake(user_query)
    
    # Save brief for downstream reference + auditability
    brief_path = output_dir / "brief.json"
    brief_path.write_text(json.dumps(brief, indent=2, ensure_ascii=False))
    
    # CEO receives enriched input
    ceo_result = run_ceo(
        raw_query=user_query,  # keep original for CEO context
        structured_brief=brief,  # new: structured version
    )
    # ... rest of pipeline

File 3: CEO + Director updates

CEO and Intel Director should now receive structured_brief and use it:

CEO prompt addition:

You will receive both the raw user query and a pre-structured brief. Use the structured brief for:
- Knowing what specific deliverables the user asked for (field: deliverables)
- Understanding target audience and geography (fields: target_audience, geography)
- Flagging any ambiguities to the user (field: ambiguities — if non-empty, consider asking for clarification before delegating)

If intent is not "niche_analysis", respond to user explaining what you CAN help with (currently only niche analysis pipeline is wired).

Director prompt addition (this strengthens the Brief-Coverage Checklist added earlier):

You will receive the pre-structured brief with explicit deliverables list. Your Executive Summary MUST address every item in `brief.deliverables`. Check the list before writing summary.
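The checklist rule can also be enforced mechanically. A sketch of a post-hoc coverage check (hypothetical helper; assumes the brief dict from `run_intake` and the summary as plain text — a real check would map summary sections, not substrings):

```python
def missing_deliverables(brief: dict, summary: str) -> list[str]:
    """Return canonical deliverable names not mentioned in the summary.

    Naive check: looks for the canonical name (with underscores relaxed
    to spaces) anywhere in the summary text.
    """
    summary_lower = summary.lower()
    missing = []
    for d in brief.get("deliverables", []):
        name = d["name"]
        variants = (name, name.replace("_", " "))
        if not any(v in summary_lower for v in variants):
            missing.append(name)
    return missing

brief = {"deliverables": [{"name": "unit_economics"}, {"name": "tech_stack"}]}
print(missing_deliverables(brief, "We cover unit economics in depth."))
# prints ['tech_stack']
```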

File 4: Tests

Path: tests/test_agent_intake.py

"""Tests for Agent-Intake."""
import pytest
import json
from unittest.mock import MagicMock, patch
 
from synth_brain.agents.agent_intake import run_intake
 
def test_intake_handles_simple_query():
    """Short query → valid minimal brief."""
    mock_response = {
        "intent": "niche_analysis",
        "intent_confidence": 0.95,
        "topic": "astrology apps",
        "target_audience": None,
        "geography": None,
        "monetization_model": None,
        "budget_constraints": None,
        "timeline": None,
        "deliverables": [],
        "ambiguities": [
            {"topic": "deliverables", "what_is_unclear": "User did not specify what analysis they want", "suggested_default": "full niche analysis"}
        ]
    }
    
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps(mock_response),
            cost_usd=0.005,
            duration_s=3.2,
        )
        mock_provider.return_value = mock_client
        
        result = run_intake("анализ ниши астрологии")
        
        assert result["intent"] == "niche_analysis"
        assert result["topic"] == "astrology apps"
        assert "raw_query" in result
        assert result["raw_query"] == "анализ ниши астрологии"
 
def test_intake_extracts_multiple_deliverables():
    """Query with multiple asks → deliverables list populated."""
    mock_response = {
        "intent": "niche_analysis",
        "intent_confidence": 0.98,
        "topic": "AI esoterics",
        "target_audience": "женщины 25-45",
        "geography": None,
        "monetization_model": "subscription tiers",
        "budget_constraints": None,
        "timeline": None,
        "deliverables": [
            {"name": "market_sizing", "raw_phrase": "анализ ниши", "priority": "must_have"},
            {"name": "audience_avatars", "raw_phrase": "сегментация с аватарами", "priority": "must_have"},
            {"name": "customer_journey_map", "raw_phrase": "CJM", "priority": "must_have"},
            {"name": "sales_funnels", "raw_phrase": "воронки продаж", "priority": "must_have"},
            {"name": "financial_model", "raw_phrase": "финансовая модель", "priority": "must_have"},
            {"name": "unit_economics", "raw_phrase": "unit-экономикой", "priority": "must_have"},
            {"name": "roi_analysis", "raw_phrase": "ROI по каналам", "priority": "must_have"},
        ],
        "ambiguities": []
    }
    
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps(mock_response),
            cost_usd=0.007,
            duration_s=5.1,
        )
        mock_provider.return_value = mock_client
        
        long_query = "Анализ ниши AI-эзотерика. ЦА: женщины 25-45. Монетизация: подписка. Нужны: сегментация с аватарами, CJM, воронки продаж, финансовая модель с unit-экономикой, ROI по каналам."
        result = run_intake(long_query)
        
        assert len(result["deliverables"]) == 7
        assert any(d["name"] == "unit_economics" for d in result["deliverables"])
 
def test_intake_returns_degraded_on_api_error():
    """API error → degraded brief, never raises."""
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.side_effect = Exception("API down")
        mock_provider.return_value = mock_client
        
        result = run_intake("test query")
        
        assert result["intent"] == "other"
        assert result["intent_confidence"] == 0.1
        assert result["raw_query"] == "test query"
        assert len(result["ambiguities"]) == 1
        assert "intake_failed" in result["ambiguities"][0]["topic"]
 
def test_intake_degraded_on_bad_json():
    """Bad JSON from LLM → degraded brief."""
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content="not valid json {{{",
            cost_usd=0.005,
            duration_s=3.0,
        )
        mock_provider.return_value = mock_client
        
        result = run_intake("test")
        assert result["intent"] == "other"
 
def test_intake_supports_openai_provider():
    """INTAKE_PROVIDER=openai uses OpenAIProvider."""
    with patch("synth_brain.agents.agent_intake.OpenAIProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps({
                "intent": "niche_analysis",
                "intent_confidence": 0.9,
                "topic": "test",
                "deliverables": [],
                "ambiguities": []
            }),
            cost_usd=0.003,
            duration_s=2.5,
        )
        mock_provider.return_value = mock_client
        
        with patch.dict("os.environ", {"INTAKE_PROVIDER": "openai"}):
            result = run_intake("test query")
            assert result["intent"] == "niche_analysis"
 
def test_intake_canonical_deliverable_names():
    """Verify known canonical names are used consistently."""
    # This is a smoke test against the canonical list
    known = {
        "market_sizing", "unit_economics", "customer_journey_map",
        "sales_funnels", "audience_avatars", "audience_segmentation",
        "tech_stack", "risk_analysis", "roi_analysis", "financial_model",
        "competitor_landscape", "content_strategy", "sales_scripts",
        "go_no_go_recommendation", "mvp_scope", "legal_considerations",
    }
    # Read INTAKE_SYSTEM_PROMPT and verify all canonical names appear
    from synth_brain.agents.agent_intake import INTAKE_SYSTEM_PROMPT
    for name in known:
        assert name in INTAKE_SYSTEM_PROMPT, f"Canonical name {name} missing from prompt"

File 5: Streamlit UI update (optional for this sprint)

Add to Streamlit — after pipeline runs, display the structured brief alongside the report:

# In reports view
if brief_path.exists():
    with st.expander("📋 Structured Brief (Agent-Intake output)"):
        brief = json.loads(brief_path.read_text())
        st.json(brief)

Status: optional. If time permits. Low risk.


Smoke test plan

Phase 1 — Unit tests

.venv/bin/python -m pytest tests/test_agent_intake.py -v
.venv/bin/python -m pytest tests/ 2>&1 | tail -15

Expected: 6+ new tests pass, 543+ total, 0 failed.

Phase 2 — REPL smoke test

from synth_brain.agents.agent_intake import run_intake
import json
 
# Test with R26 actual ТЗ
r26_query = """Техническое задание: проект AI-эзотерика. Провести анализ ниши эзотерики (таро, астрология, прогнозы) как AI-powered продукт. Целевая аудитория: женщины 25-45, интерес к саморазвитию. Модель монетизации: подписка с тремя тирами ($5-15, $15-50, $50-150+), сегментация целевой аудитории с аватарами, CJM, воронки продаж (DM + без DM), финансовая модель с unit-экономикой и ROI по каналам, MVP scope, tech stack, риски и go/no-go рекомендация."""
 
result = run_intake(r26_query)
print(json.dumps(result, indent=2, ensure_ascii=False))

Expected:

  • intent: niche_analysis with confidence 0.95+
  • topic: short phrase about AI esoterics
  • target_audience: “женщины 25-45”
  • monetization_model: subscription tiers mentioned
  • deliverables: list of 8-10 items including market_sizing, audience_avatars, customer_journey_map, sales_funnels, financial_model, unit_economics, roi_analysis, mvp_scope, tech_stack, risk_analysis, go_no_go_recommendation
  • ambiguities: possibly empty or 1-2 small items
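These expectations can be captured as quick REPL assertions against the result dict (hypothetical helper; thresholds mirror the list above, and `sample` is a stand-in, not real model output):

```python
def check_r26_result(result: dict) -> None:
    """Assert the R26 smoke-test expectations from the spec."""
    assert result["intent"] == "niche_analysis"
    assert result["intent_confidence"] >= 0.95
    names = {d["name"] for d in result["deliverables"]}
    assert len(names) >= 8
    assert {"unit_economics", "customer_journey_map", "audience_avatars"} <= names

# Stand-in result for illustration (a real run would use run_intake output)
sample = {
    "intent": "niche_analysis",
    "intent_confidence": 0.97,
    "deliverables": [{"name": n} for n in [
        "market_sizing", "audience_avatars", "customer_journey_map",
        "sales_funnels", "financial_model", "unit_economics",
        "roi_analysis", "mvp_scope",
    ]],
}
check_r26_result(sample)  # passes silently
```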

Phase 3 — Integration test

Only after Phase 1+2 pass: run full M1 pipeline with intake step included. Verify:

  • brief.json exists in run artifacts
  • Pipeline completes successfully
  • Director’s executive summary addresses every item in brief.deliverables

This is manual validation by Denis via Streamlit, not automated.


Success criteria

  • All existing tests still pass (537+ baseline)
  • 6+ new tests in test_agent_intake.py pass
  • REPL smoke test on R26 ТЗ produces 8+ correctly-named deliverables
  • brief.json saved alongside other run artifacts
  • CEO receives StructuredBrief alongside raw_query
  • Intel Director’s _aggregate_core has access to structured deliverables list
  • Degraded mode works: if intake fails, pipeline continues with basic brief
  • No regression in pipeline timing (< 15s added to 24min pipeline)
  • Commit pushed

Risks and mitigations

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| Intake fails on weird queries (voice transcripts with filler words) | Medium | Degraded-mode fallback — pipeline continues with raw_query only |
| Adds latency to pipeline | Low | Single Sonnet call (5-10s) vs 24-min pipeline — negligible |
| Canonical name list misses user concepts | Medium | Custom snake_case fallback — user-requested but novel deliverables still captured |
| CEO prompt needs significant rewrite | Medium | Keep CEO backward-compatible: if structured_brief is None, work with raw_query as before |
| Cost adds up | Low | ~$0.50/month at current query volume |
| Intake introduces wrong intent classification, blocking niche analysis | Low-Medium | CEO ultimately decides what to do with intent — intake is advisory, not gating |

Rollback plan

Option 1 — Feature flag off:

export INTAKE_DISABLED=1

Pipeline routes directly from user query to CEO, skipping intake. Runner logic must check the flag.
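A minimal flag check in the runner could look like this (sketch; the helper name is ours, and treating any of `1/true/yes` as “disabled” is an assumption):

```python
import os

def intake_enabled() -> bool:
    """Return False when INTAKE_DISABLED is set to a truthy value."""
    return os.environ.get("INTAKE_DISABLED", "").lower() not in ("1", "true", "yes")

# In the runner:
# brief = run_intake(user_query) if intake_enabled() else None
```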

Option 2 — Revert integration commit: Leaves agent_intake.py as dead code for later revival. CEO reverts to raw_query only.


Deliverables checklist

  • src/synth_brain/agents/agent_intake.py with run_intake() + INTAKE_SYSTEM_PROMPT
  • tests/test_agent_intake.py with 6+ tests
  • Integration into pipeline orchestration (call intake before CEO)
  • brief.json saved to run output directory
  • CEO prompt updated to receive structured_brief
  • Intel Director prompt references brief.deliverables in Brief-Coverage Checklist
  • INTAKE_DISABLED feature flag wired
  • REPL smoke test on R26 query passes
  • All tests passing (543+)
  • Commit: “feat: Agent-Intake for structured brief extraction (closes Director missed-requirements gap)”
  • Update manifest: 03-Roles/Agent-Intake.md status proposed → shipped

Handoff to Claude Code

Phase 1 — Code + tests (this session)

Task: implement Phase 1 of Agent-Intake per spec at /home/developer/projects/manifest/07-Roadmap/Agent-Intake-Spec.md

Read spec first.

Inspection first:
1. grep for existing LLM provider abstraction: src/synth_brain/llm/providers/
2. Find where CEO is invoked in pipeline: grep -rn "run_ceo\|CEOAgent" src/
3. Find where user_query enters pipeline: likely src/synth_brain_ui/run_m1_query.py

Phase 1 deliverables:
1. Create src/synth_brain/agents/agent_intake.py with:
   - INTAKE_SYSTEM_PROMPT constant
   - run_intake() function with provider abstraction (anthropic default, openai via env)
   - Degraded-mode fallback on errors
2. Create tests/test_agent_intake.py with 6+ tests
3. Verify: .venv/bin/python -m pytest tests/ 2>&1 | tail -15 (expect 543+ passed, 0 failed)
4. REPL smoke test:
   - Call run_intake with R26 ТЗ (from spec)
   - Verify result has intent=niche_analysis, 8+ deliverables including unit_economics, customer_journey_map, audience_avatars
5. Only if REPL + tests pass: commit "feat: Agent-Intake module - structured brief extraction"

Do NOT integrate into pipeline yet (Phase 2). Do NOT push.

Report back:
- Inspection findings (provider path, CEO location)
- Test output
- Commit hash
- REPL smoke test: intent, topic, number of deliverables, first 3 deliverable names

Phase 2 — Integration (next session)

Task: integrate Agent-Intake into M1 pipeline per spec Phase 2.

Changes:
1. Modify orchestration (run_m1_query.py or runner.py) to call run_intake before CEO
2. Save brief.json to run output directory
3. Update CEO to accept structured_brief parameter
4. Update Intel Director prompt to reference brief.deliverables
5. Add INTAKE_DISABLED env flag support
6. Run test suite
7. Commit "feat: integrate Agent-Intake into M1 pipeline"

Do NOT run M1 pipeline.

Additional considerations

When NOT to use Agent-Intake

  • Direct API calls to M1 with already-structured input (future programmatic clients)
  • Internal Claude-to-Claude calls for sub-research
  • Follow-up queries in same session (brief already exists)
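For those bypass cases, a thin wrapper could let callers supply a ready brief directly (hypothetical; `get_brief` is not in the spec, and `run_intake` is stubbed here for illustration):

```python
from typing import Any, Optional

def run_intake(raw_query: str) -> dict[str, Any]:
    """Stub standing in for agent_intake.run_intake (illustration only)."""
    return {"raw_query": raw_query, "intent": "other"}

def get_brief(raw_query: str, prestructured: Optional[dict] = None) -> dict[str, Any]:
    """Use a caller-supplied structured brief when present; otherwise run intake."""
    if prestructured is not None:
        # Programmatic clients and same-session follow-ups skip the LLM call
        return prestructured
    return run_intake(raw_query)
```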

Future extensions

  • Multi-turn clarification — if ambiguities are non-empty, ask user before running expensive pipeline
  • Template briefs — user can select “niche analysis template” that pre-fills brief structure
  • Brief versioning — if user runs same niche twice, compare briefs, note what changed

These are NOT in scope for this sprint. Parked.


Estimated duration:

  • Phase 1 (code + tests): 3-4 hours
  • Phase 2 (integration): 2-3 hours
  • Total: 5-7 hours across 1-2 Claude Code sessions

Good luck.