Agent Graph
Every message follows the same pipeline: safety check first, then memory retrieval, then routing, then response, then post-response memory evaluation. Some memory commits happen immediately; others wait for the shared session-end seam. Click any step below to see what it does.
The pipeline diagram shows the v0.9+ safety-first topology with parallel post-response memory evaluation.
Topology
The parent graph has 8 nodes. The therapeutic subgraph (compiled as
a single parent-level node) hides 7 internal nodes (dispatcher + 6
mode nodes). The crisis gate routes between branches via
Command(goto=...). The two extractors fan out in parallel after
finalize, but they now feed deterministic write policy instead of
blindly writing every candidate to long-term memory.
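The topology above can be sketched as a plain adjacency map. The node names below are taken from the Key files table, but the exact edge set is an assumption; the authoritative wiring is `build_agent_workflow()` in agent/graph.py.

```python
# Illustrative parent-graph topology (edge set assumed, not authoritative).
EDGES = {
    "crisis_gate": ["crisis_response", "load_memory"],  # Command(goto=...) picks one
    "crisis_response": ["crisis_log"],
    "crisis_log": ["finalize_turn"],
    "load_memory": ["therapeutic"],            # compiled subgraph as one parent node
    "therapeutic": ["finalize_turn"],
    "finalize_turn": ["extract_facts", "extract_procedural_rules"],  # parallel fan-out
}

def crisis_route(is_crisis: bool) -> str:
    """Mimics the gate's Command(goto=...): crisis branch or normal branch."""
    return "crisis_response" if is_crisis else "load_memory"
```

Counting keys and targets together gives the 8 parent-level nodes.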
Node responsibilities
Each node has its own datasheet on the Nodes page — filterable by category (safety, memory, routing, extraction, terminal) or capability (LLM, retry, reducer, parallel). Every card shows the node's inputs, outputs, skip conditions, and source file.
Therapeutic subgraph
The dispatcher picks exactly one mode per turn:
| Mode | When it fires |
|---|---|
| supportive | Default — user sharing feelings, seeking support, or greeting |
| reflective | User describing a recurring pattern they've already named |
| clarifying | Ambiguous message — agent needs context before responding |
| psychoeducation | User describes a reaction AND seeks understanding |
| guided_exercise | User requests a structured technique (12 exercises) |
| closing | User signals wind-down ("I should go", "thanks, this helped") |
The dispatcher uses hybrid classification: high-precision regex fast paths for obvious cases (confusion markers -> clarifying, active exercise re-entry -> guided_exercise), LLM classifier for the ambiguous middle, regex fallback when no LLM is available.
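A minimal sketch of that hybrid scheme, with made-up patterns and an injected LLM callable (the real fast paths and prompt live in agent/therapeutic/dispatcher.py):

```python
import re
from typing import Callable, Optional

# Hypothetical fast-path patterns; illustrative only.
_FAST_PATHS = [
    (re.compile(r"\b(i don't understand|what do you mean)\b", re.I), "clarifying"),
    (re.compile(r"\b(continue the exercise|next step)\b", re.I), "guided_exercise"),
]

def classify_mode(message: str,
                  llm_classify: Optional[Callable[[str], str]] = None) -> str:
    """Regex fast paths first, LLM only for the ambiguous middle,
    regex fallback (default mode) when no LLM is available."""
    for pattern, mode in _FAST_PATHS:
        if pattern.search(message):
            return mode
    if llm_classify is not None:
        return llm_classify(message)
    return "supportive"
```

Because the fast paths fire first, obvious turns never pay for an LLM call.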
Design principles
Safety is sequential. The crisis gate runs before any memory retrieval or response generation. There is no path through the graph that loads context without first passing the safety check.
Routing is deterministic-first. Regex patterns fire before the LLM. Most turns never need an LLM classification call.
Parallel post-response memory evaluation. The two extractor nodes
fan out in parallel from finalize_turn_node. Each lane does LLM
candidate extraction plus deterministic write policy. Some candidates
commit immediately; others are buffered for session end or dropped.
Both still write to the diagnostics dict via a _merge_dicts
reducer — no manual dict spreading needed, no race conditions.
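The commit-now / hold / drop decision can be sketched as a small deterministic policy. The thresholds and candidate fields here are invented for illustration; the real policy lives in the extractor nodes.

```python
from enum import Enum

class WriteDecision(Enum):
    COMMIT = "commit"   # write to long-term memory immediately
    BUFFER = "buffer"   # hold for session-end promotion
    DROP = "drop"       # discard the candidate

def write_policy(candidate: dict) -> WriteDecision:
    """Hypothetical deterministic write policy (thresholds illustrative)."""
    confidence = candidate.get("confidence", 0.0)
    if confidence >= 0.9 and candidate.get("stable", False):
        return WriteDecision.COMMIT
    if confidence >= 0.5:
        return WriteDecision.BUFFER
    return WriteDecision.DROP
```

The point is that the LLM only proposes candidates; the commit decision itself is plain, testable code.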
Reducer-backed accumulation. history and transcript use
operator.add reducers. Each turn emits only the new entries
(user turn at init, assistant turn at finalize); the checkpointer
accumulates prior turns automatically. No get_history() calls
on the hot path.
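Conceptually, a reducer-annotated channel looks like this (field set trimmed; the full AgentState is in agent/state.py):

```python
import operator
from typing import Annotated, TypedDict

class AgentState(TypedDict):
    # Reducer-annotated channels: each node returns only its delta,
    # and the checkpointer folds deltas into the accumulated value.
    history: Annotated[list, operator.add]
    transcript: Annotated[list, operator.add]

# What the framework does per turn, conceptually:
prior = ["user: hi", "assistant: hello"]
delta = ["user: I had a rough day"]       # single-element delta from init
accumulated = operator.add(prior, delta)  # list concatenation, prior + delta
```

Nodes stay cheap because they never read or rebuild the full history; they only append.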
Structured working memory. load_memory_node returns raw
WorkingMemoryEntry dicts (semantic and episodic types). Formatting
happens on demand at prompt-build time and CLI render time — not
in the graph.
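A sketch of the entry shape and an on-demand formatter. The field names here are assumptions; the real types and formatters live in agent/working_memory.py.

```python
from typing import Literal, TypedDict

class WorkingMemoryEntry(TypedDict):
    # Illustrative shape, not the actual type definition.
    type: Literal["semantic", "episodic"]
    content: str

def format_for_prompt(entries: list[WorkingMemoryEntry]) -> str:
    """Formatting happens at prompt-build time, not inside the graph,
    so graph state stays raw and render styles can differ per consumer."""
    facts = [e["content"] for e in entries if e["type"] == "semantic"]
    episodes = [e["content"] for e in entries if e["type"] == "episodic"]
    sections = []
    if facts:
        sections.append("Known facts:\n" + "\n".join(f"- {f}" for f in facts))
    if episodes:
        sections.append("Past sessions:\n" + "\n".join(f"- {e}" for e in episodes))
    return "\n\n".join(sections)
```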
Defense-in-depth retry. All I/O nodes have
RetryPolicy(max_attempts=2). Nodes catch their own expected
exceptions internally; the retry policy covers unexpected transient
failures outside the node's error handling.
Small-talk gate saves cost. Both extractors check whether the message is obviously small talk before making the LLM call. Greetings and acknowledgments skip extraction entirely.
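The gate only needs a cheap heuristic in front of the LLM call. The patterns below are invented for illustration; the real heuristic is agent/memory/small_talk_gate.py.

```python
import re

# Hypothetical greeting/acknowledgment patterns; illustrative only.
_SMALL_TALK = re.compile(
    r"^\s*(hi|hello|hey|good (morning|evening)|thanks?( you)?|ok(ay)?|bye)[!. ]*$",
    re.IGNORECASE,
)

def is_small_talk(message: str) -> bool:
    """True for messages that should skip LLM extraction entirely."""
    return bool(_SMALL_TALK.match(message))
```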
Per-turn diagnostics. Every node stamps timing and decision
metadata into state["diagnostics"] via the _merge_dicts reducer.
Keys include load_memory_ms, crisis_gate_ms, extract_facts_ms,
extract_procedural_ms, and turn_total_ms.
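The merge reducer and the per-node stamping pattern can be sketched like this (the timing-wrapper shape is illustrative; only the `_merge_dicts` reducer and the `*_ms` key convention come from the source):

```python
import time

def _merge_dicts(left: dict, right: dict) -> dict:
    """Reducer sketch: each node returns only its own diagnostics keys,
    and the framework merges them, so there is no manual dict spreading
    and no races between parallel lanes."""
    return {**left, **right}

def timed_node(name, fn):
    """Illustrative pattern: stamp a node's elapsed ms into diagnostics."""
    def wrapped(state):
        start = time.perf_counter()
        update = fn(state)
        elapsed_ms = (time.perf_counter() - start) * 1000
        update.setdefault("diagnostics", {})[f"{name}_ms"] = elapsed_ms
        return update
    return wrapped

# Parallel lanes merge cleanly:
merged = _merge_dicts({"extract_facts_ms": 12.0}, {"extract_procedural_ms": 9.5})
```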
Runtime context
Dependencies are injected via WorkflowContext, a frozen dataclass
with attribute access (runtime.context.llm_client, not dict access).
Fields: llm_client, memory_store, crisis_log_backend,
memory_mode, embedding_provider.
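The shape is roughly the following (field types and the sample value are assumptions; the real dataclass is agent/runtime_context.py):

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class WorkflowContext:
    # Field types are illustrative placeholders.
    llm_client: Any
    memory_store: Any
    crisis_log_backend: Any
    memory_mode: str
    embedding_provider: Any

# frozen=True makes instances immutable and hashable: nodes can read
# dependencies via attribute access but never mutate shared context.
ctx = WorkflowContext(
    llm_client=None, memory_store=None, crisis_log_backend=None,
    memory_mode="hybrid",  # sample value, not a documented mode name
    embedding_provider=None,
)
```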
Key files
| File | What it does |
|---|---|
| agent/graph.py | build_agent_workflow(), build_initial_state(), state_to_output() |
| agent/state.py | AgentState TypedDict with reducer annotations |
| agent/working_memory.py | WorkingMemoryEntry types, factory functions, formatters |
| agent/runtime_context.py | WorkflowContext frozen dataclass |
| agent/therapeutic/graph.py | build_therapeutic_subgraph() with TherapeuticSubgraphOutput |
| agent/therapeutic/dispatcher.py | Hybrid regex + LLM mode classification |
| agent/therapeutic/prompts.py | Knowledge composition + on-demand working memory formatting |
| agent/nodes/load_memory.py | Memory retrieval + query embedding |
| agent/nodes/crisis_gate.py | Crisis classification + routing |
| agent/nodes/crisis_response.py | Crisis response generation |
| agent/nodes/crisis_log.py | Audit log writer |
| agent/nodes/extract_facts.py | Semantic fact extraction |
| agent/nodes/extract_procedural_rules.py | Procedural rule extraction |
| agent/nodes/commit_session_memory.py | Session-end promotion for held semantic / procedural candidates |
| agent/nodes/summarize_session.py | Session-end episodic summarization |
| agent/nodes/finalize_turn.py | Transcript finalization (single-element delta) |
| agent/memory/small_talk_gate.py | Pre-extractor small-talk heuristic |
| agent/persistence.py | PersistentAgentRuntime — SQLite-backed thread state |
Adding a new therapeutic mode
- Create knowledge/response_modes/your_mode.md with the mode's knowledge content
- Create agent/therapeutic/your_mode.py with the response node function
- Add a knowledge tuple and instructions block in agent/therapeutic/prompts.py
- Register the node in agent/therapeutic/graph.py (subgraph.add_node(...))
- Add dispatch patterns in agent/therapeutic/dispatcher.py (regex fast paths + LLM prompt update)
- Add cases to eval/datasets/therapeutic_routing_v0.json
- Add a _STAGE_LABELS entry in opencouch_cli/app.py if the node name differs from the CLI's stage label vocabulary