Phase 2 Report Modules — Field Mapping Reference
Purpose: Precise data paths for each of the 6 Phase 2 slide modules based on validated R27 full.json structure. Use this document alongside Report-Generator-Spec when implementing modules.
Context: Validated via REPL against R27 artifacts at reports/streamlit_runs/20260420_154017_техническое_задание_проект_ai_эзотерика_провести_анализ_ниши/ on 2026-04-21. Structure confirmed by direct filesystem inspection.
Root structure of run directory
reports/streamlit_runs/<run_id>/
├── meta.json # Run metadata (query, timings, costs)
├── status.json # Completion status
├── judgement.json # Judge scores + verdict
├── full.json # PRIMARY — all agent outputs combined
├── report.md # Narrative report (not used by Report Generator)
├── activity.jsonl # Event log (not used by Report Generator)
├── brief.json # Agent-Intake output (if Agent-Intake deployed — future)
├── stage_*.json # Individual stage outputs (redundant with full.json)
└── pipeline.log # Runtime log (not used by Report Generator)
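Loading the artifacts a module context consumes can be sketched as follows (a hypothetical helper for illustration, not the actual `_load_module_context`; missing files degrade to empty dicts rather than raising):

```python
import json
from pathlib import Path

def load_run_artifacts(run_dir: Path) -> dict:
    # Read the three JSON artifacts the Report Generator consumes.
    # A missing file yields an empty dict so modules can degrade gracefully.
    artifacts = {}
    for name in ("meta", "judgement", "full"):
        path = run_dir / f"{name}.json"
        artifacts[name] = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return artifacts
```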
full.json top-level structure
{
"director_report": { ... }, // Intel Director's synthesized output
"scout_result": { ... }, // Scout's raw competitor research
"researcher_result": { ... } // Researcher's raw market data
}
Important: Some data appears in both director_report and in scout_result or researcher_result. Director output is pre-processed and cleaner for rendering. Prefer director_report paths except where noted.
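The preference rule can be sketched as a tiny helper (hedged: `competitors_for_render` is illustrative, not an existing function in the codebase):

```python
def competitors_for_render(full_json: dict) -> list:
    # Prefer the Director's pre-processed competitor summaries;
    # fall back to Scout's raw list only when the Director omitted them.
    director = full_json.get("director_report", {}).get("competitors")
    if director:
        return director
    return full_json.get("scout_result", {}).get("competitors", [])
```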
director_report keys (validated R27)
| Key | Used by module | Notes |
|---|---|---|
| `_meta` | — | Internal metadata, not rendered |
| `stage` | — | Processing stage indicator |
| `executive_summary` | M10_ExecutiveSummary | Main exec summary text |
| `scorecards` | M10_ExecutiveSummary | Weighted scores (Risk, Growth, Entry, Attractiveness) |
| `confidence_overall` | M20_ResearchQuality, M10 (badge) | 0.00-1.00 float |
| `recommended_next_steps` | M99_NextSteps | List of action items |
| `budget_allocation_note` | M99_NextSteps | Budget distribution across next steps |
| `gaps_opportunities` | M20_ResearchQuality | P0/P1/P2 knowledge gaps |
| `what_we_dont_know` | M20_ResearchQuality, M10 (risks) | Research limitations |
| `key_findings` | M10_ExecutiveSummary (opportunities) | Top 3-5 findings |
| `competitors` | M40, M41 | Pre-processed competitor summaries |
| `market_analysis` | M30 (cross-ref) | Director's market summary |
| `audience_segments` | M50_AudienceSegments | Segments with demographics |
| `financial_model` | M80_UnitEconomics, M81_Scenarios | Unit economics + scenarios |
| `funnels` | M60_FunnelAnalysis | CJM + drop-off analysis |
| `product_matrix` | M70_PricingTiers | Tier structure with features |
| `content_strategy` | (not in MVP) | Future module M110 |
| `sales_scripts` | (not in MVP) | Future module M111 |
| `radar_chart_data` | M10_ExecutiveSummary (optional) | Scorecard radar visualization |
scout_result keys (validated R27)
| Key | Used by module | Notes |
|---|---|---|
| `_meta` | — | |
| `_output_method` | — | |
| `competitors` | (fallback) | Raw competitor list — use director_report.competitors first |
| `total_competitors_found` | M40 header stat | Integer count |
| `search_intent` | M20 | What Scout was looking for |
| `search_queries_used` | M20 (detail) | Queries executed |
| `_source_quality` | M20_ResearchQuality | Tier breakdown (Tier 1/2/3) |
| `gaps` | M20_ResearchQuality | Gaps Scout identified |
| `confidence` | M20 | Scout-specific confidence |
| `notes` | (optional) | Free-text annotations |
researcher_result keys (validated R27)
| Key | Used by module | Notes |
|---|---|---|
| `_meta` | — | |
| `_output_method` | — | |
| `_validation_warnings` | — | Internal, not rendered |
| `market_size` | M30_MarketSizing, M31_RegionalDistribution | PRIMARY SOURCE for TAM/SAM/SOM |
| `dynamics` | M32_GrowthDrivers | CAGR + growth drivers |
| `segments` | (cross-ref) | Can complement director_report.audience_segments |
| `pricing_analysis` | M70_PricingTiers, M71_RegionalWTP | Regional WTP data |
| `sources` | M20_ResearchQuality | Source list with tier classification |
| `confidence` | M20 | Researcher-specific confidence |
| `key_findings` | (cross-ref) | Complements director_report.key_findings |
| `what_we_dont_know` | M20_ResearchQuality | Researcher-specific gaps |
judgement.json structure (validated R27)
{
"run_id": "20260420_154017",
"overall_score": 8.0, // float 0-10
"verdict": "PASS", // "PASS" | "CONDITIONAL GO" | "FAIL"
"stages": {
"director": { "score": 8.0, ... },
"scout": { "score": 7.0, ... },
"researcher": { "score": 8.0, ... },
"aggregate": { "score": 8.5, ... }
},
"blockers": [ ... ] // List of issues found
}
Used by:
- M01_Cover — `overall_score`, `verdict`
- M10_ExecutiveSummary — `overall_score` (big number badge)
- M20_ResearchQuality — per-stage scores from the `stages` dict
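A minimal sketch of pulling the per-stage scores M20 needs, using the validated structure above (the literal dict mirrors the R27 values):

```python
# judgement.json shape as validated on R27
judgement = {
    "overall_score": 8.0,
    "verdict": "PASS",
    "stages": {
        "director": {"score": 8.0},
        "scout": {"score": 7.0},
        "researcher": {"score": 8.0},
        "aggregate": {"score": 8.5},
    },
}

# M20_ResearchQuality pulls one score per stage:
stage_scores = {name: stage.get("score") for name, stage in judgement["stages"].items()}
# stage_scores == {"director": 8.0, "scout": 7.0, "researcher": 8.0, "aggregate": 8.5}
```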
meta.json structure (validated R27)
{
"run_id": "20260420_154017",
"query": "Техническое задание: проект AI-эзотерика...",
"started_at": "2026-04-20T15:40:19.676472+00:00",
"finished_at": "2026-04-20T16:04:14.664187+00:00",
"total_cost_usd": 2.681989,
"total_duration_seconds": 1434,
"total_tokens": 492735,
"compression_meta": { ... }
}
Used by:
- M01_Cover — topic extracted from `query`, `run_id`, `finished_at`
- M20_ResearchQuality — cost and duration breakdown
Module-by-module field mapping
For each Phase 2 module, exact data paths with extraction notes.
M01_Cover
File: src/synth_brain/reporting/modules/m01_cover.py
Section: cover
Priority: 100
is_available()
Always available if meta.json and judgement.json exist with minimum fields.
def is_available(self, ctx):
    # Explicit None check: an overall_score of 0.0 is falsy but still a valid score
    return bool(ctx.meta.get("query")) and ctx.judgement.get("overall_score") is not None
extract()
{
    "topic": extract_topic_from_query(ctx.meta["query"]),
    "judge_score": ctx.judgement["overall_score"],
    "verdict": ctx.judgement.get("verdict", "N/A"),
    "run_date": ctx.meta.get("finished_at", "")[:10],  # YYYY-MM-DD
    "sam": safe_get(ctx.full_json, "researcher_result.market_size", alias_groups={
        "sam": ["sam_usd", "sam", "sam_value", "serviceable_addressable_market"]
    }),
    "cagr": safe_get(ctx.full_json, "researcher_result.dynamics", alias_groups={
        "cagr": ["cagr", "cagr_pct", "growth_rate_cagr", "annual_growth_rate"]
    }),
    "target_audience_brief": extract_audience_brief(ctx.meta["query"]),
    "ltv": safe_get(ctx.full_json, "director_report.financial_model", alias_groups={
        "ltv": ["ltv_usd", "ltv", "lifetime_value", "ltv_premium_tier"]
    }),
    "cac": safe_get(ctx.full_json, "director_report.financial_model", alias_groups={
        "cac": ["cac_usd", "cac", "customer_acquisition_cost", "cac_blended"]
    }),
}
Helper: extract_topic_from_query
User query is free text like "Техническое задание: проект AI-эзотерика…". Extract a 3-5 word topic phrase.
def extract_topic_from_query(query: str) -> str:
    # Common patterns: "проект X", "ниша Y", "AI для Z"
    q = query.lower()  # lowercase once; markers must be lowercase too, or "AI-" never matches
    for marker in ["проект ", "ниша ", "анализ ниши ", "ai-"]:
        if marker in q:
            idx = q.find(marker)
            candidate = query[idx + len(marker):].split(".")[0].split(",")[0]
            words = candidate.split()[:5]
            return " ".join(words).strip()
    # Fallback: first 40 chars
    return query[:40].strip() + "..."
render()
Title slide layout per Genspark reference (verified in PPTX inspection):
- Center-top: big topic title (48pt)
- Below: “Инвестиционный отчёт” subtitle (20pt muted)
- 6 KPI tiles in 3×2 grid below subtitle
- Footer: run_date, run_id small (10pt)
KPI tiles: Judge Score (with "/10" suffix), SAM, CAGR, Target Audience, LTV, CAC.
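As a sanity check, running the topic-extraction helper against an R27-style query yields the short title phrase (the helper is restated here, with case-insensitive marker matching, so the snippet runs standalone):

```python
def extract_topic_from_query(query: str) -> str:
    # Standalone restatement of the helper; markers compared case-insensitively.
    q = query.lower()
    for marker in ["проект ", "ниша ", "анализ ниши ", "ai-"]:
        if marker in q:
            idx = q.find(marker)
            candidate = query[idx + len(marker):].split(".")[0].split(",")[0]
            return " ".join(candidate.split()[:5]).strip()
    return query[:40].strip() + "..."

topic = extract_topic_from_query("Техническое задание: проект AI-эзотерика. Провести анализ ниши")
# topic == "AI-эзотерика"
```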
M10_ExecutiveSummary
File: src/synth_brain/reporting/modules/m10_executive_summary.py
Section: executive
Priority: 100
is_available()
def is_available(self, ctx):
    return bool(ctx.full_json.get("director_report", {}).get("executive_summary"))
extract()
dr = ctx.full_json["director_report"]
{
    "title": "Executive Summary",
    "judge_score": ctx.judgement.get("overall_score", 0),
    "verdict": ctx.judgement.get("verdict", ""),
    "weighted_score": safe_get(dr, "scorecards", alias_groups={
        "weighted": ["weighted_score", "overall_weighted", "composite_score"]
    }, default=0),
    "confidence": dr.get("confidence_overall", 0),
    "key_findings": dr.get("key_findings", [])[:5],  # top 5
    "risks": dr.get("what_we_dont_know", [])[:3],
    "scorecards": safe_get(dr, "scorecards", alias_groups={
        "risk": ["risk_score", "risk", "risk_assessment"],
        "growth": ["growth_potential_score", "growth_potential", "growth_score"],
        "entry": ["entry_difficulty_score", "entry_difficulty", "entry_score"],
        "attractiveness": ["market_attractiveness_score", "market_attractiveness"],
    }),
}
render()
Two-column layout:
- Left: “Opportunities” (green accent) — weighted score + key_findings as bullets
- Right: “Risks” (amber accent) — confidence + risks as bullets
- Bottom center: verdict badge (PASS green / CONDITIONAL GO amber / FAIL rose)
M99_NextSteps
File: src/synth_brain/reporting/modules/m99_next_steps.py
Section: verdict
Priority: 100
is_available()
def is_available(self, ctx):
    return bool(ctx.full_json.get("director_report", {}).get("recommended_next_steps"))
extract()
dr = ctx.full_json["director_report"]
{
    "title": "Next Steps",
    "verdict": ctx.judgement.get("verdict", ""),
    "steps": dr.get("recommended_next_steps", [])[:5],  # top 5
    "budget_note": dr.get("budget_allocation_note", ""),
}
Step structure (per R27 inspection)
Each step in recommended_next_steps has variable structure. Use alias groups:
step_alias_groups = {
    "action": ["action", "step", "description", "title"],
    "timeline": ["timeline", "duration", "weeks", "months", "time_estimate"],
    "budget": ["budget", "budget_usd", "cost_usd", "estimated_cost"],
    "kpi": ["kpi", "success_metric", "target", "outcome"],
    "priority": ["priority", "p", "rank"],
}
render()
- Title + verdict badge (center-top)
- 3-5 action cards in single column (stacked vertically)
- Each card: number + action + timeline + budget + KPI
- Card background color coded by priority (P0 blue / P1 muted)
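Normalizing one raw step against `step_alias_groups` can be sketched like this (using a local first-match helper instead of the shared `safe_get`; the sample step values are invented):

```python
step_alias_groups = {
    "action": ["action", "step", "description", "title"],
    "timeline": ["timeline", "duration", "weeks", "months", "time_estimate"],
    "budget": ["budget", "budget_usd", "cost_usd", "estimated_cost"],
    "kpi": ["kpi", "success_metric", "target", "outcome"],
    "priority": ["priority", "p", "rank"],
}

def first_match(d: dict, aliases: list, default=None):
    # Return the first non-empty value among the alias keys.
    for key in aliases:
        if d.get(key):
            return d[key]
    return default

# Invented example step; the LLM may use any of the alias spellings
raw_step = {"step": "Launch landing page MVP", "weeks": 2, "cost_usd": 500, "target": "100 signups"}
card = {canon: first_match(raw_step, aliases) for canon, aliases in step_alias_groups.items()}
# card["action"] == "Launch landing page MVP"; card["budget"] == 500; card["priority"] is None
```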
M30_MarketSizing
File: src/synth_brain/reporting/modules/m30_market_sizing.py
Section: market
Priority: 100
is_available()
def is_available(self, ctx):
    return bool(ctx.full_json.get("researcher_result", {}).get("market_size"))
extract()
rr = ctx.full_json["researcher_result"]["market_size"]
{
    "title": "Анализ рынка",
    "subtitle": "TAM / SAM / SOM с источниками",
    "tam": safe_get(rr, "", alias_groups={
        "value": ["tam_usd", "tam", "tam_value", "total_addressable_market"]
    }),
    "sam": safe_get(rr, "", alias_groups={
        "value": ["sam_usd", "sam", "sam_value", "serviceable_addressable_market"]
    }),
    "som": safe_get(rr, "", alias_groups={
        "value": ["som_usd", "som", "som_value", "serviceable_obtainable_market"]
    }),
    "cagr": safe_get(ctx.full_json, "researcher_result.dynamics", alias_groups={
        "cagr": ["cagr", "cagr_pct", "annual_growth_rate"]
    }),
    "regional": safe_get(rr, "regional_distribution", default=[]),
    "sources": rr.get("sources", []) or ctx.full_json.get("researcher_result", {}).get("sources", []),
}
render()
- Title + subtitle (top)
- Funnel visualization: TAM box → SAM box → SOM box (top to bottom, narrowing)
- CAGR badge (top-right corner)
- Regional distribution mini-chart (bottom)
- Sources count badge (bottom-right)
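TAM/SAM/SOM arrive as raw USD numbers; a compact formatter for the funnel boxes could look like this (`fmt_usd` is a hypothetical helper, not part of the current codebase):

```python
def fmt_usd(value) -> str:
    # Compact USD formatting for KPI boxes, e.g. 2_400_000_000 -> "$2.4B".
    if value is None:
        return "N/A"
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            return f"${value / threshold:.1f}{suffix}"
    return f"${value:.0f}"

# fmt_usd(2_400_000_000) -> "$2.4B"; fmt_usd(75_000_000) -> "$75.0M"
```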
M50_AudienceSegments
File: src/synth_brain/reporting/modules/m50_audience_segments.py
Section: audience
Priority: 100
is_available()
def is_available(self, ctx):
    segments = ctx.full_json.get("director_report", {}).get("audience_segments", [])
    return len(segments) > 0
extract()
segments_raw = ctx.full_json["director_report"].get("audience_segments", [])
segments = []
for seg in segments_raw[:3]:  # Max 3 segments on one slide
    segments.append({
        "name": safe_get(seg, "", alias_groups={
            "name": ["name", "segment_name", "title", "label"]
        }),
        "sam_share": safe_get(seg, "", alias_groups={
            "share": ["sam_share_pct", "sam_share", "share_of_sam", "percentage"]
        }),
        "size_usd": safe_get(seg, "", alias_groups={
            "size": ["size_usd", "size", "market_size_usd", "segment_size"]
        }),
        "age_range": safe_get(seg, "demographics", alias_groups={
            "age": ["age", "age_range", "age_group"]
        }),
        "income": safe_get(seg, "demographics", alias_groups={
            "income": ["income", "income_usd", "income_range"]
        }),
        "arpu": safe_get(seg, "", alias_groups={
            "arpu": ["arpu_monthly", "arpu", "revenue_per_user"]
        }),
        "jtbd": safe_get(seg, "", alias_groups={
            "jtbd": ["jtbd", "job_to_be_done", "primary_need"]
        }),
        "pain_points": safe_get(seg, "", alias_groups={
            "pain": ["pain_points", "pains", "frustrations"]
        }, default=[]),
    })
{
    "title": "Сегменты аудитории",
    "subtitle": "Демография, JTBD, боли",
    "segments": segments,
}
render()
3-column card layout:
- Each card 4” wide × 5.5” tall
- Top: segment name + SAM share %
- Middle: demographics (age, income)
- Bottom: JTBD in italics, 2-3 pain points as bullets
- ARPU badge in top-right corner of each card
M80_UnitEconomics
File: src/synth_brain/reporting/modules/m80_unit_economics.py
Section: financial
Priority: 100
is_available()
def is_available(self, ctx):
    fm = ctx.full_json.get("director_report", {}).get("financial_model", {})
    return bool(fm)
extract() — CRITICAL: use alias groups extensively
This is where the R27→R28 rendering bug happened: the LLM produces 20+ different field-name variants for the same 6 metrics. Study section_renderers.py:165-172 at commit f5f8644 for the reference implementation.
fm = ctx.full_json["director_report"]["financial_model"]
{
    "title": "Юнит-экономика",
    "subtitle": "CAC / LTV / ratio по каналам и тирам",
    "cac_blended": safe_get(fm, "", alias_groups={
        "cac": ["cac_usd", "cac", "customer_acquisition_cost", "cac_blended", "cac_avg"]
    }),
    "cac_by_channel": safe_get(fm, "", alias_groups={
        "by_channel": ["cac_by_channel", "cac_channel_breakdown", "acquisition_costs"]
    }, default={}),
    "ltv_by_tier": safe_get(fm, "", alias_groups={
        "by_tier": ["ltv_by_tier", "ltv_tier_breakdown", "lifetime_values"]
    }, default={}),
    "ltv_blended": safe_get(fm, "", alias_groups={
        "ltv": ["ltv_usd", "ltv", "lifetime_value", "ltv_blended", "ltv_avg"]
    }),
    "ratio": safe_get(fm, "", alias_groups={
        "ratio": ["ltv_cac_ratio", "ltv_to_cac", "cac_ltv_ratio"]
    }),
    "gross_margin": safe_get(fm, "", alias_groups={
        "margin": ["gross_margin_pct", "gross_margin", "margin_pct"]
    }),
    "payback_months": safe_get(fm, "", alias_groups={
        "payback": ["payback_months", "cac_payback", "payback_period"]
    }),
    "breakeven_month": safe_get(fm, "", alias_groups={
        # render() shows a breakeven tile; these alias names are assumptions, verify against R27
        "breakeven": ["breakeven_month", "breakeven_months", "break_even_month"]
    }),
}
render()
- Title + subtitle
- Big LTV/CAC ratio display (top center, 48pt)
- Left column: CAC per channel (table)
- Right column: LTV per tier (table)
- Bottom row: 3 stat tiles (gross_margin, payback_months, breakeven_month)
- Color code ratio badge: green if ≥ 3.0, amber if 1.5-3.0, rose if < 1.5
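The ratio badge rule maps directly to thresholds. A minimal sketch (color names are the design-system tokens used above; "muted" for a missing ratio is an assumption):

```python
def ratio_badge_color(ratio) -> str:
    # Green if LTV/CAC >= 3.0, amber if 1.5-3.0, rose below 1.5.
    # "muted" for an unknown ratio is an assumption, not in the spec.
    if ratio is None:
        return "muted"
    if ratio >= 3.0:
        return "green"
    if ratio >= 1.5:
        return "amber"
    return "rose"
```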
Shared helper — safe_get with alias_groups
Required utility in src/synth_brain/reporting/modules/base.py or a new utils.py:
def safe_get(obj, path, alias_groups=None, default=None):
    """
    Safe nested field access with alias fallback.

    Args:
        obj: dict or nested dict
        path: dot-separated path, e.g., "researcher_result.market_size".
            Empty string means obj itself.
        alias_groups: dict of field_name -> list of alias names,
            e.g., {"sam": ["sam_usd", "sam", "serviceable_addressable_market"]}.
            Returns the value from the first alias that matches.
        default: value returned if no match is found

    Returns:
        Found value or default. Note: falsy values (0, "", []) are
        treated as missing and fall through to default.
    """
    if not obj:
        return default
    # Navigate path
    current = obj
    if path:
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                return default
    # If no alias_groups, return current
    if not alias_groups:
        return current if current else default
    # Alias matching only makes sense on dicts
    if not isinstance(current, dict):
        return default
    # For a single-field alias group, return the first matching value
    if len(alias_groups) == 1:
        key, aliases = next(iter(alias_groups.items()))
        for alias in aliases:
            if alias in current and current[alias]:
                return current[alias]
        return default
    # For multi-field alias groups, return dict of {canonical_name: value}
    result = {}
    for canonical, aliases in alias_groups.items():
        for alias in aliases:
            if alias in current and current[alias]:
                result[canonical] = current[alias]
                break
    return result or default
Testing per module
Each module needs 3 tests (per Report-Generator-Spec):
# tests/test_reporting_modules.py
import pytest
from pathlib import Path
from pptx import Presentation

from synth_brain.reporting.modules.m01_cover import M01_Cover
from synth_brain.reporting.modules.base import ModuleContext
from synth_brain.reporting.generator import _load_module_context

R27_DIR = Path("reports/streamlit_runs/20260420_154017_техническое_задание_проект_ai_эзотерика_провести_анализ_ниши")

def test_m01_cover_available_on_r27():
    ctx = _load_module_context(R27_DIR)
    module = M01_Cover()
    assert module.is_available(ctx) is True

def test_m01_cover_unavailable_on_empty_context():
    ctx = ModuleContext(full_json={}, m2_verification=None, m2_scoring=None,
                        chamber_transcripts=None, outcome_history=None, meta={}, judgement={})
    module = M01_Cover()
    assert module.is_available(ctx) is False

def test_m01_cover_renders_without_error():
    ctx = _load_module_context(R27_DIR)
    module = M01_Cover()
    data = module.extract(ctx)
    assert "topic" in data
    prs = Presentation()
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    module.render(slide, data)  # should not raise

18 tests total (6 modules × 3 tests each).
Implementation ordering
Implement modules in this order (simplest → most complex data):
- M01_Cover (simple fields, establishes pattern)
- M99_NextSteps (iterates array of steps, uses alias groups)
- M10_ExecutiveSummary (scorecard data, most complex rendering)
- M30_MarketSizing (researcher_result path, regional distribution)
- M50_AudienceSegments (3-column layout, nested segment data)
- M80_UnitEconomics (MUST use alias groups extensively — R27→R28 bug source)
After each module:
- Run its 3 tests
- Generate full deck on R27 to verify accumulation works
- Visual spot-check (optional: convert PPTX to PNG via LibreOffice headless)
Known data variations R27 vs R28
R28 has completed_partial status with director_report.executive_summary missing. When testing Phase 2 modules, verify graceful behavior:
- M10_ExecutiveSummary should return False from is_available() → slide not generated
- M01_Cover should still work (meta and judgement exist on R28? — need verification)
Pattern: every module checks data presence explicitly, never assumes.
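The pattern reduces to filtering modules by their own availability check before rendering (a sketch; `modules` would be the registered Phase 2 module instances):

```python
def select_available_modules(modules, ctx):
    # A module with missing data is silently skipped, so a partial run
    # (like R28) yields a shorter deck instead of a crash.
    return [m for m in modules if m.is_available(ctx)]
```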
Links
- Report-Generator-Spec — parent spec
- R28-Timeout-Incident — context for partial-run handling
- Commit f5f8644 in synth-brain — reference implementation of alias groups in section_renderers.py:165-172
- Commit 52a76cf in synth-brain — Phase 1 scaffold (base class, generator, design system)