Agent-Intake Implementation Spec
Purpose: Detailed execution plan for Claude Code to implement Agent-Intake — a new agent that structures raw user queries into a canonical brief format before the CEO delegates to the Intel Director.
Author: Co-Developer Claude (Opus 4.7), 2026-04-20
Status: Ready to execute, pending Denis approval
Estimated effort: 1-2 days (6-10 hours of Claude Code work, can split across 2 sessions)
Expected impact: Cleaner briefs → better downstream agent output. Primary value: fixes the “Director missed half the brief” issue (R26 Director 4.0/10) at the source.
Why this matters
Current flow:
User query (free text) → Streamlit UI → CEO → Intel Director → ... pipeline
Problem: User briefs vary wildly in structure. Some are a one-line “анализ ниши X” (“analysis of niche X”); some are 500-word multi-section ТЗ (technical specs); some are voice-memo transcripts with filler words. The CEO and Intel Director interpret ambiguous briefs inconsistently.
R26 Judge feedback directly validates this: “Brief requests avatars, CJM, funnel analysis, unit-economics, ROI by channel, tech stack, risk matrix — none addressed in summary” — Director missed explicit asks because the brief was a wall of text without clear section markers.
Fix the problem at the source, not the symptoms: instead of only patching the Director’s summarization skill (the R26 brief-coverage checklist already does that), also structure the brief before it reaches the CEO. Give the Director a parsed, structured object with explicit fields.
Architecture
New flow
User query → Agent-Intake → Structured Brief (JSON) → CEO → Intel Director → pipeline
Agent-Intake responsibilities
- Parse free-text query into structured fields
- Classify intent: niche_analysis / team_assessment / strategic_question / other
- Extract explicit requirements (list of deliverables the user asked for)
- Extract implicit context (target audience, budget constraints, geography, monetization model)
- Flag ambiguities: what’s unclear, what needs clarification
- Produce a canonical `StructuredBrief` object for downstream agents
Position in pipeline
- After: User submits query via Streamlit
- Before: CEO receives delegation
- Sibling to: Nothing — sits in orchestration layer as first agent
Cost
- Single Sonnet 4.5 call per user query
- Input: user query (usually < 500 tokens)
- Output: structured JSON (~300-500 tokens)
- Estimated cost: $0.005 per call — negligible
Latency
- Sonnet 4.5 call ~5-10 seconds
- Added to total pipeline time (currently 24 min). Negligible relative impact.
Data model
StructuredBrief (Pydantic model)
```python
from datetime import datetime
from typing import Literal, Optional

from pydantic import BaseModel, Field


class DeliverableRequirement(BaseModel):
    """A specific deliverable the user asked for."""
    name: str  # e.g. "market_sizing", "unit_economics", "customer_journey_map"
    raw_phrase: str  # how the user phrased it: "финансовая модель с unit-экономикой"
    priority: Literal["must_have", "nice_to_have"] = "must_have"


class Ambiguity(BaseModel):
    """Something unclear in the brief."""
    topic: str  # e.g. "target_geography"
    what_is_unclear: str  # "User mentioned 'global' but also 'Russian-speaking audience'"
    suggested_default: Optional[str] = None  # best guess if no clarification


class StructuredBrief(BaseModel):
    """Canonical brief format for downstream agents."""
    # Raw input
    raw_query: str
    received_at: datetime

    # Classification
    intent: Literal["niche_analysis", "team_assessment", "strategic_question", "follow_up", "other"]
    intent_confidence: float = Field(ge=0.0, le=1.0)

    # Extracted fields (nullable — not all briefs have all fields)
    topic: str  # short identifier: "AI esoterics", "European fintech", "mobile games Brazil"
    target_audience: Optional[str] = None
    geography: Optional[str] = None  # "RU", "global", "EU + NA"
    monetization_model: Optional[str] = None  # "subscription $5-15 / $15-50 / $50+"
    budget_constraints: Optional[str] = None
    timeline: Optional[str] = None

    # Deliverables
    deliverables: list[DeliverableRequirement]

    # Gaps
    ambiguities: list[Ambiguity]

    # Meta
    intake_cost_usd: Optional[float] = None
    intake_model: Optional[str] = None
    intake_duration_s: Optional[float] = None
```
Canonical deliverable names
To keep downstream consistent, Agent-Intake normalizes user phrasing to canonical names:
| User might say | Canonical name |
|---|---|
| ”анализ рынка”, “market analysis”, “размер рынка” | market_sizing |
| ”юнит-экономика”, “unit economics”, “LTV CAC” | unit_economics |
| ”CJM”, “путь клиента”, “customer journey” | customer_journey_map |
| ”воронки”, “sales funnels”, “DM funnels” | sales_funnels |
| ”аватары”, “avatars”, “портреты ЦА” | audience_avatars |
| ”tech stack”, “технологии”, “стэк” | tech_stack |
| ”риски”, “risk matrix” | risk_analysis |
| ”ROI”, “ROI по каналам” | roi_analysis |
| ”go/no-go”, “решение”, “recommendation” | go_no_go_recommendation |
| (+ ~15 more mappings) | |
This enables downstream agents to check brief.deliverables by canonical name instead of parsing free text.
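The downstream check then becomes a set lookup rather than text parsing. A minimal sketch, assuming the brief is the plain dict produced by intake (the function name `missing_deliverables` is illustrative, not part of the project's API):

```python
def missing_deliverables(brief: dict, covered: set[str]) -> list[str]:
    """Canonical names the brief requested but the report did not cover."""
    return [d["name"] for d in brief.get("deliverables", []) if d["name"] not in covered]

brief = {"deliverables": [
    {"name": "market_sizing", "raw_phrase": "анализ рынка", "priority": "must_have"},
    {"name": "unit_economics", "raw_phrase": "юнит-экономика", "priority": "must_have"},
]}
print(missing_deliverables(brief, {"market_sizing"}))  # → ['unit_economics']
```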
Implementation
File 1: Agent-Intake module
Path: src/synth_brain/agents/agent_intake.py
```python
"""Agent-Intake — structures raw user queries into canonical briefs."""
import json
import os
from datetime import datetime, timezone
from typing import Any

# ... imports for StructuredBrief, DeliverableRequirement, Ambiguity
from synth_brain.llm.providers.anthropic import AnthropicProvider
from synth_brain.llm.providers.openai import OpenAIProvider

INTAKE_SYSTEM_PROMPT = """You are Agent-Intake. You receive a free-text user query and produce a structured brief that downstream agents can work with.

Your job is to:
1. Classify intent (niche_analysis / team_assessment / strategic_question / follow_up / other)
2. Extract the topic, target audience, geography, monetization model, budget, timeline (any that are stated)
3. List all deliverables the user explicitly requested
4. Flag any ambiguities that need clarification

## Canonical deliverable names
Map user phrasing to these canonical names:
- market_sizing — TAM/SAM/SOM, market size, рынок
- unit_economics — юнит-экономика, LTV, CAC, contribution margin
- customer_journey_map — CJM, путь клиента
- sales_funnels — воронки продаж, DM funnels, no-DM funnels
- audience_avatars — аватары, портреты ЦА, segments
- audience_segmentation — сегментация (if distinct from avatars)
- tech_stack — tech stack, технологии
- risk_analysis — риски, risk matrix
- roi_analysis — ROI, unit economics by channel
- financial_model — финансовая модель, P&L, projections
- competitor_landscape — конкуренты, competitor analysis
- content_strategy — контент-стратегия, video formats
- sales_scripts — sales scripts, call scripts
- go_no_go_recommendation — go/no-go, решение, verdict
- mvp_scope — MVP scope, минимальный продукт
- legal_considerations — юридические вопросы, legal, compliance

If the user requests something NOT in this list, create a custom canonical name in snake_case and include it — downstream will handle it as best-effort.

## Output format
Respond ONLY with valid JSON matching this schema:
{
  "intent": "niche_analysis",
  "intent_confidence": 0.9,
  "topic": "brief topic identifier (2-5 words)",
  "target_audience": "description or null",
  "geography": "RU or global or null",
  "monetization_model": "description or null",
  "budget_constraints": "description or null",
  "timeline": "description or null",
  "deliverables": [
    {"name": "market_sizing", "raw_phrase": "анализ рынка", "priority": "must_have"},
    ...
  ],
  "ambiguities": [
    {"topic": "geography", "what_is_unclear": "...", "suggested_default": "..."}
  ]
}

## Rules
- Be concise. If information is not stated, use null. Don't guess values.
- Classify confidence honestly. If the query is "анализ ниши астрологии" (very short), intent_confidence can still be 0.9 because the intent is obvious. If the query is ambiguous between a strategic question and a niche analysis, confidence should be 0.5-0.7.
- List every explicit deliverable. If the user says "финансовая модель с unit-экономикой и ROI по каналам", that's THREE deliverables: financial_model, unit_economics, roi_analysis.
- Ambiguities are for things that need human clarification. Don't list preferences or assumptions as ambiguities.
"""


def run_intake(
    raw_query: str,
    provider: str | None = None,
    model: str | None = None,
) -> dict[str, Any]:
    """Structure a raw query into a canonical brief.

    Returns a dict matching the StructuredBrief schema.
    Never raises — returns a degraded brief with raw_query + intent=other on error.
    """
    provider = provider or os.environ.get("INTAKE_PROVIDER", "anthropic")
    model = model or os.environ.get("INTAKE_MODEL", "claude-sonnet-4-5-20250929")
    received_at = datetime.now(timezone.utc)
    try:
        if provider == "anthropic":
            client = AnthropicProvider(api_key=os.environ["ANTHROPIC_API_KEY"])
        elif provider == "openai":
            client = OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"])
        else:
            raise ValueError(f"Unknown INTAKE_PROVIDER: {provider}")
        response = client.complete(
            model=model,
            system=INTAKE_SYSTEM_PROMPT,
            messages=[{"role": "user", "content": raw_query}],
            max_tokens=2000,
            response_format={"type": "json_object"} if provider == "openai" else None,
        )
        parsed = json.loads(response.content)
        parsed["raw_query"] = raw_query
        parsed["received_at"] = received_at.isoformat()
        parsed["intake_cost_usd"] = response.cost_usd
        parsed["intake_model"] = model
        parsed["intake_duration_s"] = response.duration_s
        return parsed
    except Exception as e:
        # Degraded brief — don't block the pipeline
        return {
            "raw_query": raw_query,
            "received_at": received_at.isoformat(),
            "intent": "other",
            "intent_confidence": 0.1,
            "topic": raw_query[:50],
            "target_audience": None,
            "geography": None,
            "monetization_model": None,
            "budget_constraints": None,
            "timeline": None,
            "deliverables": [],
            "ambiguities": [{"topic": "intake_failed", "what_is_unclear": str(e)[:200], "suggested_default": None}],
            "intake_cost_usd": 0,
            "intake_model": model,
            "intake_duration_s": 0,
        }
```
File 2: Integration into orchestration
Inspection first: Find where CEO receives the query. Likely in src/synth_brain_ui/run_m1_query.py or src/synth_brain/agents/runner.py.
Pattern: Call run_intake() before run_ceo(). Pass StructuredBrief (as dict) to CEO.
```python
import json

from synth_brain.agents.agent_intake import run_intake

# In pipeline initialization (output_dir and run_ceo come from the existing pipeline):
def run_pipeline(user_query: str, ...):
    # NEW: intake step
    brief = run_intake(user_query)

    # Save brief for downstream reference + auditability
    brief_path = output_dir / "brief.json"
    brief_path.write_text(json.dumps(brief, indent=2, ensure_ascii=False))

    # CEO receives enriched input
    ceo_result = run_ceo(
        raw_query=user_query,       # keep original for CEO context
        structured_brief=brief,     # new: structured version
    )
    # ... rest of pipeline
```
File 3: CEO + Director updates
CEO and Intel Director should now receive structured_brief and use it:
CEO prompt addition:
You will receive both the raw user query and a pre-structured brief. Use the structured brief for:
- Knowing what specific deliverables the user asked for (field: deliverables)
- Understanding target audience and geography (fields: target_audience, geography)
- Flagging any ambiguities to the user (field: ambiguities — if non-empty, consider asking for clarification before delegating)
If intent is not "niche_analysis", respond to user explaining what you CAN help with (currently only niche analysis pipeline is wired).
Director prompt addition (Brief-Coverage Checklist we added earlier gets stronger):
You will receive the pre-structured brief with explicit deliverables list. Your Executive Summary MUST address every item in `brief.deliverables`. Check the list before writing summary.
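As a mental model for what the Director is asked to do, the coverage rule reduces to a scan over `brief.deliverables`. A naive keyword sketch (the real check is performed by the Director model itself; `uncovered_items` is a hypothetical name):

```python
def uncovered_items(summary: str, deliverables: list[dict]) -> list[str]:
    """Which requested deliverables never appear in the summary text?
    Naive substring scan: canonical name with underscores replaced by spaces."""
    text = summary.lower()
    return [d["name"] for d in deliverables if d["name"].replace("_", " ") not in text]

items = [{"name": "market_sizing"}, {"name": "unit_economics"}, {"name": "roi_analysis"}]
summary = "Market sizing shows a $2B TAM; unit economics are positive at tier 2."
print(uncovered_items(summary, items))  # → ['roi_analysis']
```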
File 4: Tests
Path: tests/test_agent_intake.py
```python
"""Tests for Agent-Intake."""
import json
import os
from unittest.mock import MagicMock, patch

import pytest

from synth_brain.agents.agent_intake import INTAKE_SYSTEM_PROMPT, run_intake


def test_intake_handles_simple_query():
    """Short query → valid minimal brief."""
    mock_response = {
        "intent": "niche_analysis",
        "intent_confidence": 0.95,
        "topic": "astrology apps",
        "target_audience": None,
        "geography": None,
        "monetization_model": None,
        "budget_constraints": None,
        "timeline": None,
        "deliverables": [],
        "ambiguities": [
            {"topic": "deliverables", "what_is_unclear": "User did not specify what analysis they want", "suggested_default": "full niche analysis"}
        ],
    }
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps(mock_response),
            cost_usd=0.005,
            duration_s=3.2,
        )
        mock_provider.return_value = mock_client
        result = run_intake("анализ ниши астрологии")
    assert result["intent"] == "niche_analysis"
    assert result["topic"] == "astrology apps"
    assert "raw_query" in result
    assert result["raw_query"] == "анализ ниши астрологии"


def test_intake_extracts_multiple_deliverables():
    """Query with multiple asks → deliverables list populated."""
    mock_response = {
        "intent": "niche_analysis",
        "intent_confidence": 0.98,
        "topic": "AI esoterics",
        "target_audience": "женщины 25-45",
        "geography": None,
        "monetization_model": "subscription tiers",
        "budget_constraints": None,
        "timeline": None,
        "deliverables": [
            {"name": "market_sizing", "raw_phrase": "анализ ниши", "priority": "must_have"},
            {"name": "audience_avatars", "raw_phrase": "сегментация с аватарами", "priority": "must_have"},
            {"name": "customer_journey_map", "raw_phrase": "CJM", "priority": "must_have"},
            {"name": "sales_funnels", "raw_phrase": "воронки продаж", "priority": "must_have"},
            {"name": "financial_model", "raw_phrase": "финансовая модель", "priority": "must_have"},
            {"name": "unit_economics", "raw_phrase": "unit-экономикой", "priority": "must_have"},
            {"name": "roi_analysis", "raw_phrase": "ROI по каналам", "priority": "must_have"},
        ],
        "ambiguities": [],
    }
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps(mock_response),
            cost_usd=0.007,
            duration_s=5.1,
        )
        mock_provider.return_value = mock_client
        long_query = "Анализ ниши AI-эзотерика. ЦА: женщины 25-45. Монетизация: подписка. Нужны: сегментация с аватарами, CJM, воронки продаж, финансовая модель с unit-экономикой, ROI по каналам."
        result = run_intake(long_query)
    assert len(result["deliverables"]) == 7
    assert any(d["name"] == "unit_economics" for d in result["deliverables"])


def test_intake_returns_degraded_on_api_error():
    """API error → degraded brief, never raises."""
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.side_effect = Exception("API down")
        mock_provider.return_value = mock_client
        result = run_intake("test query")
    assert result["intent"] == "other"
    assert result["intent_confidence"] == 0.1
    assert result["raw_query"] == "test query"
    assert len(result["ambiguities"]) == 1
    assert "intake_failed" in result["ambiguities"][0]["topic"]


def test_intake_degraded_on_bad_json():
    """Bad JSON from LLM → degraded brief."""
    with patch("synth_brain.agents.agent_intake.AnthropicProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content="not valid json {{{",
            cost_usd=0.005,
            duration_s=3.0,
        )
        mock_provider.return_value = mock_client
        result = run_intake("test")
    assert result["intent"] == "other"


def test_intake_supports_openai_provider():
    """INTAKE_PROVIDER=openai uses OpenAIProvider."""
    with patch("synth_brain.agents.agent_intake.OpenAIProvider") as mock_provider:
        mock_client = MagicMock()
        mock_client.complete.return_value = MagicMock(
            content=json.dumps({
                "intent": "niche_analysis",
                "intent_confidence": 0.9,
                "topic": "test",
                "deliverables": [],
                "ambiguities": [],
            }),
            cost_usd=0.003,
            duration_s=2.5,
        )
        mock_provider.return_value = mock_client
        with patch.dict(os.environ, {"INTAKE_PROVIDER": "openai"}):
            result = run_intake("test query")
    assert result["intent"] == "niche_analysis"


def test_intake_canonical_deliverable_names():
    """Verify known canonical names are used consistently."""
    # Smoke test: every canonical name must appear in the system prompt
    known = {
        "market_sizing", "unit_economics", "customer_journey_map",
        "sales_funnels", "audience_avatars", "audience_segmentation",
        "tech_stack", "risk_analysis", "roi_analysis", "financial_model",
        "competitor_landscape", "content_strategy", "sales_scripts",
        "go_no_go_recommendation", "mvp_scope", "legal_considerations",
    }
    for name in known:
        assert name in INTAKE_SYSTEM_PROMPT, f"Canonical name {name} missing from prompt"
```
File 5: Streamlit UI update (optional for this sprint)
Add to Streamlit — after pipeline runs, display the structured brief alongside the report:
```python
# In reports view
if brief_path.exists():
    with st.expander("📋 Structured Brief (Agent-Intake output)"):
        brief = json.loads(brief_path.read_text())
        st.json(brief)
```
Status: optional. If time permits. Low risk.
Smoke test plan
Phase 1 — Unit tests
```shell
.venv/bin/python -m pytest tests/test_agent_intake.py -v
.venv/bin/python -m pytest tests/ 2>&1 | tail -15
```
Expected: 6+ new tests pass, 543+ total, 0 failed.
Phase 2 — REPL smoke test
```python
from synth_brain.agents.agent_intake import run_intake
import json

# Test with the actual R26 ТЗ
r26_query = """Техническое задание: проект AI-эзотерика. Провести анализ ниши эзотерики (таро, астрология, прогнозы) как AI-powered продукт. Целевая аудитория: женщины 25-45, интерес к саморазвитию. Модель монетизации: подписка с тремя тирами ($5-15, $15-50, $50-150+), сегментация целевой аудитории с аватарами, CJM, воронки продаж (DM + без DM), финансовая модель с unit-экономикой и ROI по каналам, MVP scope, tech stack, риски и go/no-go рекомендация."""

result = run_intake(r26_query)
print(json.dumps(result, indent=2, ensure_ascii=False))
```
Expected:
- `intent`: niche_analysis with confidence 0.95+
- `topic`: short phrase about AI esoterics
- `target_audience`: “женщины 25-45”
- `monetization_model`: subscription tiers mentioned
- `deliverables`: 8+ items including market_sizing, audience_avatars, customer_journey_map, sales_funnels, financial_model, unit_economics, roi_analysis, mvp_scope, tech_stack, risk_analysis, go_no_go_recommendation
- `ambiguities`: possibly empty or 1-2 small items
Phase 3 — Integration test
Only after Phase 1+2 pass: run full M1 pipeline with intake step included. Verify:
- `brief.json` exists in run artifacts
- Pipeline completes successfully
- Director’s executive summary addresses every item in `brief.deliverables`
This is manual validation by Denis via Streamlit, not automated.
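For the eyeball pass, a one-line digest of the saved artifact is enough. A sketch (`summarize_brief` is a hypothetical helper, not part of the spec's deliverables):

```python
import json
from pathlib import Path

def summarize_brief(path: Path) -> str:
    """One-line digest of a saved brief.json for manual review."""
    brief = json.loads(path.read_text())
    names = ", ".join(d["name"] for d in brief.get("deliverables", []))
    return f"{brief.get('intent', '?')}: {names or '(no deliverables)'}"
```

Point it at the run's `brief.json` and compare the listed names against the original query.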
Success criteria
- All existing tests still pass (537+ baseline)
- 6+ new tests in `test_agent_intake.py` pass
- REPL smoke test on R26 ТЗ produces 8+ correctly-named deliverables
- `brief.json` saved alongside other run artifacts
- CEO receives StructuredBrief alongside raw_query
- Intel Director’s `_aggregate_core` has access to the structured deliverables list
- Degraded mode works: if intake fails, the pipeline continues with a basic brief
- No regression in pipeline timing (< 15s added to a 24-min pipeline)
- Commit pushed
Risks and mitigations
| Risk | Likelihood | Mitigation |
|---|---|---|
| Intake fails on weird queries (voice transcripts with filler words) | Medium | Degraded-mode fallback — pipeline continues with raw_query only |
| Adds latency to pipeline | Low | Sonnet single call 5-10s vs 24min pipeline — negligible |
| Canonical name list misses user concepts | Medium | Custom snake_case fallback — user-requested but novel deliverables still captured |
| CEO prompt needs significant rewrite | Medium | Keep CEO backward-compatible: if structured_brief is None, work with raw_query as before |
| Cost adds up | Low | ~$0.50/month at $0.005 per query — negligible |
| Intake introduces wrong intent classification, blocking niche analysis | Low-Medium | CEO ultimately decides what to do with intent — intake is advisory, not gating |
Rollback plan
Option 1 — Feature flag off:
```shell
export INTAKE_DISABLED=1
```
Pipeline routes directly from the user query to the CEO, skipping intake. Runner logic must check the flag.
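Runner-side flag handling might look like the following sketch (`intake_enabled` and `run_pipeline_entry` are illustrative names; the dummy return values stand in for the real calls):

```python
import os

def intake_enabled() -> bool:
    """INTAKE_DISABLED rollback flag check."""
    return os.environ.get("INTAKE_DISABLED", "0") != "1"

def run_pipeline_entry(user_query: str) -> dict:
    # Hypothetical wrapper: skip intake entirely when the flag is set.
    if intake_enabled():
        return {"path": "intake"}  # would call run_intake(user_query) here
    return {"path": "direct", "raw_query": user_query}
```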
Option 2 — Revert integration commit: Leaves agent_intake.py as dead code for later revival. CEO reverts to raw_query only.
Deliverables checklist
- `src/synth_brain/agents/agent_intake.py` with run_intake() + INTAKE_SYSTEM_PROMPT
- `tests/test_agent_intake.py` with 6+ tests
- Integration into pipeline orchestration (call intake before CEO)
- `brief.json` saved to the run output directory
- CEO prompt updated to receive structured_brief
- Intel Director prompt references `brief.deliverables` in the Brief-Coverage Checklist
- INTAKE_DISABLED feature flag wired
- REPL smoke test on R26 query passes
- All tests passing (543+)
- Commit: “feat: Agent-Intake for structured brief extraction (closes Director missed-requirements gap)”
- Update manifest: `03-Roles/Agent-Intake.md` status proposed → shipped
Handoff to Claude Code
Phase 1 — Code + tests (this session)
Task: implement Phase 1 of Agent-Intake per spec at /home/developer/projects/manifest/07-Roadmap/Agent-Intake-Spec.md
Read spec first.
Inspection first:
1. grep for existing LLM provider abstraction: src/synth_brain/llm/providers/
2. Find where CEO is invoked in pipeline: grep -rn "run_ceo\|CEOAgent" src/
3. Find where user_query enters pipeline: likely src/synth_brain_ui/run_m1_query.py
Phase 1 deliverables:
1. Create src/synth_brain/agents/agent_intake.py with:
- INTAKE_SYSTEM_PROMPT constant
- run_intake() function with provider abstraction (anthropic default, openai via env)
- Degraded-mode fallback on errors
2. Create tests/test_agent_intake.py with 6+ tests
3. Verify: .venv/bin/python -m pytest tests/ 2>&1 | tail -15 (expect 543+ passed, 0 failed)
4. REPL smoke test:
- Call run_intake with R26 ТЗ (from spec)
- Verify result has intent=niche_analysis, 8+ deliverables including unit_economics, customer_journey_map, audience_avatars
5. Only if REPL + tests pass: commit "feat: Agent-Intake module - structured brief extraction"
Do NOT integrate into pipeline yet (Phase 2). Do NOT push.
Report back:
- Inspection findings (provider path, CEO location)
- Test output
- Commit hash
- REPL smoke test: intent, topic, number of deliverables, first 3 deliverable names
Phase 2 — Integration (next session)
Task: integrate Agent-Intake into M1 pipeline per spec Phase 2.
Changes:
1. Modify orchestration (run_m1_query.py or runner.py) to call run_intake before CEO
2. Save brief.json to run output directory
3. Update CEO to accept structured_brief parameter
4. Update Intel Director prompt to reference brief.deliverables
5. Add INTAKE_DISABLED env flag support
6. Run test suite
7. Commit "feat: integrate Agent-Intake into M1 pipeline"
Do NOT run M1 pipeline.
Additional considerations
When NOT to use Agent-Intake
- Direct API calls to M1 with already-structured input (future programmatic clients)
- Internal Claude-to-Claude calls for sub-research
- Follow-up queries in same session (brief already exists)
Future extensions
- Multi-turn clarification — if ambiguities are non-empty, ask user before running expensive pipeline
- Template briefs — user can select “niche analysis template” that pre-fills brief structure
- Brief versioning — if user runs same niche twice, compare briefs, note what changed
These are NOT in scope for this sprint. Parked.
Estimated duration:
- Phase 1 (code + tests): 3-4 hours
- Phase 2 (integration): 2-3 hours
Total: 5-7 hours across 1-2 Claude Code sessions.
Good luck.