Agent Graph
This page describes the text agent runtime used by the CLI, web text chat, and synchronous/streaming chat APIs. Voice uses OpenAI Realtime as its live speech loop and records finalized transcripts back into the shared runtime; see Voice for that path.
Every text message follows the same pipeline: safety check first, then a safe-turn dispatcher for memory commands or factual lookups, then memory retrieval and response generation. Long-term memory writes are handled by explicit memory commands and the shared session-end path. Click any step below to see what it does.
Click any step to expand. The pipeline shows the safety-first runtime topology.
Topology
The text runtime keeps branch decisions explicit through an app-owned
TextTurnGraph. The graph resolves one typed route plan, then delegates response
generation to the selected OpenAI Agents SDK specialist executor. The crisis gate
runs first, deterministic dispatch handles memory-control and grounded lookup
branches, and safe therapeutic turns enter the TherapeuticAgent. Post-response
extractors remain runtime-managed side effects.
The full execution path:
START
→ crisis_gate
├─ crisis_resource_lookup → CrisisAgent → crisis_log → finalization
└─ turn_dispatch
├─ memory_control → finalization
├─ grounded_lookup → finalization
└─ load_memory → TherapeuticAgent / GuidedExerciseAgent → finalization
finalization → END
Runtime Stage Responsibilities
Runtime stages own deterministic state deltas, persistence boundaries, and tool-result validation. Specialist agents own natural-language response generation and the tools assigned to their branch.
The route plan is shared by normal, streaming, and shadow execution so those entrypoints do not each reimplement the branch table.
Voice reuses the service implementations behind those tools, but not this
text-stage topology. Realtime function tools call the backend service layer
directly through /api/voice/realtime/tools.
Therapeutic Response Selection
The TherapeuticAgent selects exactly one therapeutic response style per turn:
| Response style | When it fires |
|---|---|
supportive | Default — user sharing feelings, seeking support, or greeting |
reflective | User describing a recurring pattern they've already named |
clarifying | Ambiguous message — agent needs context before responding |
psychoeducation | User describes a reaction AND seeks understanding |
technique | User wants structured therapeutic work without launching a named exercise |
guided_exercise | User requests a structured exercise (13 exercises) |
closing | User signals wind-down ("I should go", "thanks, this helped") |
The style decision is LLM-owned with local validation around active
exercise state. When a guided exercise is active, the dispatcher preserves the pinned
exercise_therapeutic_approach for exercise continuations and narrow
clarifying turns. Psychoeducation side-turns keep the exercise active, but may
use a fresh top-level approach for the explanation.
Design principles
Safety is sequential. The crisis gate runs before any memory retrieval or response generation. There is no path through the runtime that loads context without first passing the safety check.
Operational dispatch runs before memory loading. After the crisis
gate, turn_dispatch routes explicit saved-memory commands to
memory control, explicit factual lookup requests to grounded lookup,
and ordinary support to memory loading plus the therapeutic agent.
Routing is LLM-owned, locally validated. Classifiers own natural language decisions; plain services enforce schema validity, active-flow consistency, confirmation gates, and truth-table normalization. Missing or failing LLM calls retry or surface instead of silently degrading to regex.
Post-response memory evaluation. Semantic and procedural extractors run after turn finalization as runtime-managed side effects. Some candidates commit immediately; others are buffered for session end or dropped. Extractor diagnostics are merged into the final turn output.
Reducer-backed accumulation. history and transcript use
operator.add reducers. Each turn emits only the new entries
(user turn at init, assistant turn at finalize); the checkpointer
accumulates prior turns automatically. No get_history() calls
on the hot path.
Structured working memory. The load-memory stage returns raw
WorkingMemoryEntry dicts (semantic and episodic types). Formatting
happens on demand at prompt-build time and CLI render time — not
inside the response model.
Defense-in-depth retry. I/O branches keep explicit provider and storage failure handling at the runtime boundary so failures retry or surface instead of being hidden by fallback text.
Per-turn diagnostics. Runtime stages and side effects stamp
timing and decision metadata into state["diagnostics"] via the
_merge_dicts reducer. Keys include load_memory_ms,
crisis_gate_ms, and turn_total_ms.
Runtime context
Dependencies are injected via WorkflowContext, a frozen dataclass
with attribute access (runtime.context.llm_client, not dict access).
Fields: llm_client, memory_store, crisis_log_backend,
memory_mode, response_llm, embedding_provider,
session_memory_buffer. See Runtime
for the full breakdown.
Key files
| File | What it does |
|---|---|
agent/runtime/turn.py | Compatibility entrypoint for build_initial_state(), state_to_output(), and run_agent() |
agent/state.py | AgentState TypedDict with reducer annotations and split input/output schemas |
agent/memory/entries.py | WorkingMemoryEntry types, factory functions, formatters |
agent/runtime/workflow_context.py | WorkflowContext frozen dataclass |
agent/runtime/openai_text_runtime.py | OpenAI Agents SDK text runtime orchestration |
agent/specialists/ | Triage, therapeutic, crisis, and guided-exercise specialist agent definitions |
agent/tools/ | Agent-owned memory, grounded lookup, crisis lookup, therapeutic response skill, and guided-exercise tools |
agent/skills/therapeutic_response.py | Response-style guidance rendered as prompt-local SDK skill context |
agent/specialists/therapeutic_response/prompts.py | Therapeutic response prompt composition owned by the therapeutic agent |
agent/specialists/guided_exercise.py | Guided-exercise agent instructions and system prompt ownership |
agent/skills/guided_exercises/ | Guided-exercise definitions, selection, state transitions, memory helpers, and skill rendering |
agent/runtime/memory_context.py | Runtime-owned turn-memory loading backed by agent/memory/retrieval/service.py |
agent/guardrails/ | Crisis input guardrail, classifier service, prompts, and safety delta construction |
agent/tools/crisis.py | Crisis resource lookup tool and response delta |
agent/memory/control/service.py | Executes memory commands (list, forget, recall toggle, save preference) |
agent/tools/grounded.py | Search-grounded factual replies |
agent/runtime/session/commit.py | Session-end promotion path backed by agent/memory/commit/service.py |
agent/runtime/session/summarize.py | Session-end episodic summarization path backed by agent/memory/operations/episodic.py |
agent/runtime/turn_finalization.py | Transcript finalization (single-element delta) |
agent/audit/crisis_log.py | Audit-log write helper, backend protocol, and in-memory implementations |
agent/audit/postgres_crisis_log.py | Primary durable crisis log with retention purge |
agent/audit/sqlite_crisis_log.py | SQLite crisis-log fallback backend |
agent/runtime/runtime.py | PersistentAgentRuntime — configured persistent thread state |
Adding a new therapeutic response style
- Create
agent/prompts/sources/response_styles/your_style.mdwith the response style's knowledge content - Add prompt-source wiring in
agent/specialists/therapeutic_response/prompt_sources.pyand instructions inagent/specialists/therapeutic_response/prompt_instructions.py - Update
agent/skills/therapeutic_response.pyif the style needs different prompt-local skill metadata - Update
load_therapeutic_response_skillguidance if the therapeutic response agent needs clearer selection criteria for the new style - Add or update backend tests for the route and state contract
- Add a
STAGE_LABELSentry inagent/models.pyif the runtime stage name needs a distinct user-facing label