Template: Scenario

A reusable format for describing cross-functional flows: sequences of actions that involve multiple agents and human touchpoints.

Copy it to 04-Processes/Scenario-<slug>.md or include it in an existing process doc.

When to use

  • Cross-functional flow (several agents / departments involved)
  • There is an explicit sequence of stages
  • There are HITL checkpoints
  • Failure modes need to be documented explicitly

If it is a single-agent task, Template-Task is enough. If it is a quality check loop, the Agent-Judge default process is enough.

Frontmatter

---
title: "Scenario — [Name]"
type: scenario
status: proposed|active|deprecated
tags: [scenario, ...]
---
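The frontmatter fields above can be checked mechanically. The following is a minimal sketch, assuming the frontmatter has already been parsed into a Python dict by some YAML parser; the function name `validate_frontmatter` is hypothetical, and treating the `scenario` tag as required is an assumption based on the template line `tags: [scenario, ...]`.

```python
# Allowed values taken from the template line "status: proposed|active|deprecated".
ALLOWED_STATUS = {"proposed", "active", "deprecated"}


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the frontmatter looks valid."""
    errors = []
    if not str(meta.get("title", "")).strip():
        errors.append("title is missing")
    if meta.get("type") != "scenario":
        errors.append("type must be 'scenario'")
    if meta.get("status") not in ALLOWED_STATUS:
        errors.append("status must be one of: " + ", ".join(sorted(ALLOWED_STATUS)))
    tags = meta.get("tags")
    # Assumption: every scenario doc carries the "scenario" tag.
    if not isinstance(tags, list) or "scenario" not in tags:
        errors.append("tags must be a list that includes 'scenario'")
    return errors
```

An empty result means the four fields are present and consistent with the template; anything else is a human-readable list of what to fix.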

Sections

Summary

1-2 sentences: what this scenario does, when it starts, and how it ends.

Trigger

What starts the scenario:

  • Event (e.g. hypothesis.proposed)
  • Schedule (e.g. weekly)
  • Human request
  • Another scenario (chain)

Agents / Departments involved

List them in the order in which they first appear:

Stages

Table with columns: # | stage | actor | action | criticality.

| # | Stage | Actor | Action | Criticality |
|---|-------|-------|--------|-------------|
| 1 |       |       |        | L1/L2/L3/… |

HITL Checkpoints

Where human approval / interaction is mandatory:

  • Stage N: [checkpoint description, why a human is needed]

Decision Points

Branching: “if X → path A, otherwise path B”.

Failure modes

What can go wrong, plus mitigation:

  • Failure: [description]
    • Detection: [how we notice that something broke]
    • Mitigation: [what to do]
    • Recovery: [how we get back to a good state]

KPI

Measurable metrics for evaluating the scenario's success:

  • Success rate
  • Duration (p50, p95)
  • Cost per run
  • Specific domain metrics
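The generic metrics above could be computed from run records roughly as follows. This is a minimal sketch: the `Run` record and its field names are hypothetical, and the nearest-rank percentile is one of several common definitions, chosen here because it always returns an observed duration.

```python
import math
from dataclasses import dataclass


@dataclass
class Run:
    # Hypothetical record of one scenario run.
    succeeded: bool
    duration_hours: float
    cost: float


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def kpi_summary(runs):
    """Compute success rate, duration p50/p95, and average cost per run."""
    if not runs:
        raise ValueError("need at least one run")
    durations = [r.duration_hours for r in runs]
    return {
        "success_rate": sum(r.succeeded for r in runs) / len(runs),
        "duration_p50": percentile(durations, 50),
        "duration_p95": percentile(durations, 95),
        "cost_per_run": sum(r.cost for r in runs) / len(runs),
    }
```

Domain-specific metrics are left out on purpose, since they depend on the scenario being described.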

SLA

End-to-end expectation:

  • Target: X hours from trigger to completion
  • Per-stage max duration
  • Escalation if exceeded

Open Questions

Unresolved details.

References

Related documents.


Example: Hypothesis-to-Validation (Synth Nova-native)

A filled-in example for an existing Synth Nova flow.

Summary

From a proposed hypothesis to the first validation data (target ≤72h per Manifesto). A gate for the “keep developing vs kill / pivot” decision.

Trigger

A human (Max Nova) emits a hypothesis.proposed event through Agent-CEO, using the form from Template-Hypothesis.

Agents / Departments involved

Stages

| # | Stage | Actor | Action | Criticality |
|---|-------|-------|--------|-------------|
| 1 | Hypothesis intake | Agent-CEO | Parse intent, decompose | L1 |
| 2 | Department routing | Agent-CEO | Select Director | L1 |
| 3 | Child task creation | Agent-CEO | Create tasks with success_criteria, kill_criteria | L2 |
| 4 | Task distribution | Director | Assign to Executors | L1 |
| 5 | Data gathering | Executors | Execute research | L1-L2 |
| 6 | Quality check | Agent-Judge | Review findings | L1 |
| 7 | Aggregation | Director | Combine results, confidence assessment | L2 |
| 8 | Persevere/Pivot/Kill recommendation | Agent-CEO | Synthesize, recommend | L2 (propose) |
| 9 | Human decision | Founder | Accept / reject recommendation | L3 decision by human |
| 10 | Outcome labeling | HITL-Gateway | Capture decision + reasoning | L1 |
| 11 | Memory persistence | Memory-Model | Store full trace | L1 |

HITL Checkpoints

  • Stage 9: Founder decision is mandatory. Agents cannot autonomously kill a hypothesis; that is a product/strategy decision.
  • Escalation at any time: if confidence < 0.7 during Stages 5-8 → escalate early to the Founder with partial data.

Decision Points

  • After Stage 7, based on aggregated confidence:
    • High confidence (>0.85) + favorable data → propose “persevere”
    • Moderate (0.7-0.85) → propose “pivot with refinement”
    • Low (<0.7) → propose “kill” with reasoning
    • Very low or kill_criteria met → auto-recommend “kill” without Chamber
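The branching above can be read as a small decision function. The sketch below is an illustration under stated assumptions, not the actual Agent-CEO logic: the names `Recommendation` and `recommend` are hypothetical, both band edges (0.7 and 0.85) are treated as "moderate", and since the source gives no numeric threshold for "very low", only the kill_criteria branch skips the Chamber. The source also does not say what happens with high confidence and unfavorable data, so that case returns no action.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    # "persevere", "pivot with refinement", "kill", or None when no rule applies.
    action: Optional[str]
    skip_chamber: bool = False


def recommend(confidence: float, favorable: bool,
              kill_criteria_met: bool = False) -> Recommendation:
    """Map aggregated confidence (after Stage 7) to a proposed recommendation."""
    if kill_criteria_met:
        # Auto-recommend "kill" without Chamber.
        return Recommendation("kill", skip_chamber=True)
    if confidence < 0.7:
        return Recommendation("kill")
    if confidence <= 0.85:
        return Recommendation("pivot with refinement")
    if favorable:
        return Recommendation("persevere")
    # High confidence with unfavorable data is not covered by the rules above.
    return Recommendation(None)
```

Whatever the function returns is still only a proposal: the Founder makes the final call at Stage 9.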

Failure modes

  • Failure: Research data contradictory / insufficient

    • Detection: Agent-Judge flags low confidence, inter-source disagreement
    • Mitigation: Expand research scope, add sources, extend budget
    • Recovery: Return to Stage 5 with an expanded task spec
  • Failure: Budget exceeded mid-scenario

    • Detection: Running cost > 80% allocated
    • Mitigation: Agent-CEO assesses — complete with current data vs expand budget (Founder approval) vs abort
    • Recovery: Partial results go to the Founder with an explicit “budget-constrained” note
  • Failure: Stale data in memory

    • Detection: Memory refs older than configurable TTL
    • Mitigation: Re-research, invalidate stale refs
    • Recovery: Re-run affected stages
  • Failure: External tool unavailable (web search, etc.)

    • Detection: Tool call failures > retry threshold
    • Mitigation: Fallback tool OR partial completion with the gap noted
    • Recovery: Retry with exponential backoff; if it still fails → escalate
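The recovery path for an unavailable external tool (retry with exponential backoff, then escalate) might look like the sketch below. The names `call_with_backoff` and `EscalationRequired`, the default of three retries, and the one-second base delay are all assumptions for illustration; the source only says that a retry threshold exists.

```python
import time


class EscalationRequired(Exception):
    """Raised when the tool keeps failing after all retries."""


def call_with_backoff(tool, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call tool(); after each failure wait base_delay * 2**attempt, then retry.

    Once max_retries retries are used up, raise EscalationRequired.
    """
    for attempt in range(max_retries + 1):
        try:
            return tool()
        except Exception as exc:  # broad on purpose in this sketch
            if attempt == max_retries:
                raise EscalationRequired(
                    f"tool failed {attempt + 1} times"
                ) from exc
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the delays can be observed in tests without actually waiting.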

KPI

  • Primary: Time from trigger to Founder decision (target ≤72h per Manifesto)
  • Validation confidence delivered (target ≥0.70, or an explicit low-confidence note)
  • Cost per hypothesis (target: decreasing QoQ)
  • Human intervention rate outside planned Stage 9 (target <30%)
  • Kill-criteria hit rate (measures whether the criteria are well calibrated)

SLA

  • End-to-end: 72h target, 120h hard limit (after that → auto-escalate with partial results)
  • Stage 5 (research) max: 48h
  • Stage 9 (Founder decision) max: 24h (via HITL-Gateway)
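The SLA numbers above can be turned into a simple check. This is a minimal sketch with hypothetical names (`end_to_end_status`, `stage_breaches`); it assumes elapsed time is tracked in hours and that a run sitting exactly on a limit is still within it.

```python
TARGET_HOURS = 72          # end-to-end target
HARD_LIMIT_HOURS = 120     # after this, auto-escalate with partial results
STAGE_MAX_HOURS = {5: 48, 9: 24}  # per-stage limits listed in the SLA


def end_to_end_status(elapsed_hours: float) -> str:
    """Classify a run by elapsed time since the trigger."""
    if elapsed_hours > HARD_LIMIT_HOURS:
        return "auto-escalate"
    if elapsed_hours > TARGET_HOURS:
        return "over target"
    return "on target"


def stage_breaches(stage_hours: dict[int, float]) -> list[int]:
    """Return the stage numbers whose duration exceeds their SLA limit."""
    return sorted(
        stage
        for stage, hours in stage_hours.items()
        if stage in STAGE_MAX_HOURS and hours > STAGE_MAX_HOURS[stage]
    )
```

Stages without a listed limit are ignored rather than guessed at.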

Open Questions

  • Persistent intent tracking (see Agent-CEO open questions)
  • Handling sub-hypotheses (can generate within scenario)
  • Cross-hypothesis learning — sharing insights

References


Rules for scenario authors

  1. One scenario = one coherent flow. If you find yourself describing “everything about research”, split it into multiple scenarios.
  2. Actor per stage must be explicit. Not “the team”, but a specific agent or the Founder.
  3. Criticality per stage. This helps verify that routing is consistent with Rules-Criticality.
  4. Failure modes are mandatory. A scenario without failure modes is incomplete.
  5. KPIs must be measurable. “Good outcomes” is not a KPI. Duration, success rate, and cost are KPIs.
  6. Update when reality changes. A scenario is a live document, like the rest of the vault.
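Some of these rules lend themselves to an automated lint pass. The sketch below assumes a scenario has been loaded into hypothetical `Stage` and `Scenario` records; it covers only rules 2, 3, and 4 plus the presence of a KPI. Whether a KPI is truly measurable still needs a human reader.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    number: int
    name: str
    actor: str
    action: str
    criticality: str


@dataclass
class Scenario:
    title: str
    stages: list = field(default_factory=list)
    failure_modes: list = field(default_factory=list)
    kpis: list = field(default_factory=list)


# Assumption: these placeholder values count as "not explicit".
VAGUE_ACTORS = {"", "team", "the team"}


def lint(scenario: Scenario) -> list[str]:
    """Return rule violations; an empty list means no problems were found."""
    problems = []
    for stage in scenario.stages:
        if stage.actor.strip().lower() in VAGUE_ACTORS:
            problems.append(f"stage {stage.number}: actor is not explicit")
        if not stage.criticality.strip():
            problems.append(f"stage {stage.number}: criticality is missing")
    if not scenario.failure_modes:
        problems.append("no failure modes listed")
    if not scenario.kpis:
        problems.append("no KPI listed")
    return problems
```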

Related documents