Agent Graph

This page describes the text agent runtime used by the CLI, web text chat, and synchronous/streaming chat APIs. Voice uses OpenAI Realtime as its live speech loop and records finalized transcripts back into the shared runtime; see Voice for that path.

Every text message follows the same pipeline: safety check first, then a safe-turn dispatcher for memory commands or factual lookups, then memory retrieval and response generation. Long-term memory writes are handled by explicit memory commands and the shared session-end path.


Topology

The text runtime keeps branch decisions explicit through an app-owned TextTurnGraph. The graph resolves one typed route plan, then delegates response generation to the selected OpenAI Agents SDK specialist executor. The crisis gate runs first, deterministic dispatch handles memory-control and grounded lookup branches, and safe therapeutic turns enter the TherapeuticAgent. Post-response extractors remain runtime-managed side effects.

Topology diagram (components shown: OpenAITextRuntime, runtime routing, SDK Runner, SDK session memory). Stages and their roles:

START
  build_initial_state
  crisis_gate: LLM-only safety classifier; decides the crisis gate route
Crisis branch
  crisis_resource_lookup: location-aware resources
  crisis_response: PFA overlay
  crisis_log: audit record
Safe-turn branch
  turn_dispatch: memory · lookup · support
  memory_control: saved-memory command
  grounded_lookup: factual lookup
  load_memory: hybrid RRF retrieval
  TherapeuticAgent: response | guided exercise
finalization: assistant turn
END: runtime response returned

The full execution path:

START
→ crisis_gate
   ├─ crisis_resource_lookup → CrisisAgent → crisis_log → finalization
   └─ turn_dispatch
      ├─ memory_control → finalization
      ├─ grounded_lookup → finalization
      └─ load_memory → TherapeuticAgent / GuidedExerciseAgent → finalization

finalization → END
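
The branch table above can be sketched as a single typed route plan that every entrypoint resolves once. This is a minimal illustration, not the actual TextTurnGraph: `Branch`, `RoutePlan`, `STAGES`, and `resolve_route_plan` are hypothetical names, and the `dispatch` labels ("memory", "lookup", "support") are assumed classifier outputs. Only the stage names come from the execution path above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Branch(Enum):
    CRISIS = "crisis"
    MEMORY_CONTROL = "memory_control"
    GROUNDED_LOOKUP = "grounded_lookup"
    SUPPORT = "support"


# Stage names mirror the execution path; the table layout itself is an assumption.
STAGES = {
    Branch.CRISIS: (
        "crisis_gate", "crisis_resource_lookup", "CrisisAgent",
        "crisis_log", "finalization",
    ),
    Branch.MEMORY_CONTROL: (
        "crisis_gate", "turn_dispatch", "memory_control", "finalization",
    ),
    Branch.GROUNDED_LOOKUP: (
        "crisis_gate", "turn_dispatch", "grounded_lookup", "finalization",
    ),
    Branch.SUPPORT: (
        "crisis_gate", "turn_dispatch", "load_memory",
        "TherapeuticAgent", "finalization",
    ),
}

# Hypothetical mapping from dispatcher labels to safe-turn branches.
DISPATCH_TO_BRANCH = {
    "memory": Branch.MEMORY_CONTROL,
    "lookup": Branch.GROUNDED_LOOKUP,
    "support": Branch.SUPPORT,
}


@dataclass(frozen=True)
class RoutePlan:
    branch: Branch
    stages: Tuple[str, ...]


def resolve_route_plan(is_crisis: bool, dispatch: str) -> RoutePlan:
    """Resolve one plan per turn; the crisis decision always wins."""
    branch = Branch.CRISIS if is_crisis else DISPATCH_TO_BRANCH[dispatch]
    return RoutePlan(branch=branch, stages=STAGES[branch])
```

Because every plan starts with `crisis_gate`, no branch in this sketch reaches `load_memory` without passing the safety check first.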

Runtime Stage Responsibilities

Runtime stages own deterministic state deltas, persistence boundaries, and tool-result validation. Specialist agents own natural-language response generation and the tools assigned to their branch.

The route plan is shared by normal, streaming, and shadow execution so those entrypoints do not each reimplement the branch table.

Voice reuses the service implementations behind those tools, but not this text-stage topology. Realtime function tools call the backend service layer directly through /api/voice/realtime/tools.

Therapeutic Response Selection

The TherapeuticAgent selects exactly one therapeutic response style per turn:

| Response style | When it fires |
| --- | --- |
| supportive | Default: user sharing feelings, seeking support, or greeting |
| reflective | User describing a recurring pattern they've already named |
| clarifying | Ambiguous message; agent needs context before responding |
| psychoeducation | User describes a reaction AND seeks understanding |
| technique | User wants structured therapeutic work without launching a named exercise |
| guided_exercise | User requests a structured exercise (13 exercises) |
| closing | User signals wind-down ("I should go", "thanks, this helped") |

The style decision is LLM-owned with local validation around active exercise state. When a guided exercise is active, the dispatcher preserves the pinned exercise_therapeutic_approach for exercise continuations and narrow clarifying turns. Psychoeducation side-turns keep the exercise active, but may use a fresh top-level approach for the explanation.
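
A rough sketch of the local validation described above, under stated assumptions: `resolve_approach` is a hypothetical helper, the approach values are placeholders, and the handling of styles the text does not mention during an active exercise (they fall through to the proposed approach here) is a guess rather than documented behavior.

```python
from typing import Optional

# Style names come from the table above.
STYLES = {
    "supportive", "reflective", "clarifying", "psychoeducation",
    "technique", "guided_exercise", "closing",
}

# Styles that keep the pinned approach while an exercise is active.
PINNED_STYLES = {"guided_exercise", "clarifying"}


def resolve_approach(
    style: str,
    proposed_approach: str,
    pinned_approach: Optional[str],
) -> str:
    """Validate the LLM-chosen style and pick the approach for this turn."""
    if style not in STYLES:
        # Surface invalid classifier output instead of guessing a style.
        raise ValueError(f"unknown response style: {style!r}")
    if pinned_approach is None:
        # No active exercise: the LLM's proposal stands.
        return proposed_approach
    if style in PINNED_STYLES:
        # Exercise continuations and narrow clarifying turns stay pinned.
        return pinned_approach
    # Psychoeducation side-turns may use a fresh approach; other styles
    # follow the same path in this sketch (assumption).
    return proposed_approach
```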

Design principles

Safety is sequential. The crisis gate runs before any memory retrieval or response generation. There is no path through the runtime that loads context without first passing the safety check.

Operational dispatch runs before memory loading. After the crisis gate, turn_dispatch routes explicit saved-memory commands to memory control, explicit factual lookup requests to grounded lookup, and ordinary support to memory loading plus the therapeutic agent.

Routing is LLM-owned, locally validated. Classifiers own natural language decisions; plain services enforce schema validity, active-flow consistency, confirmation gates, and truth-table normalization. Missing or failing LLM calls retry or surface instead of silently degrading to regex.
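
One way to picture "retry or surface" is the sketch below. It assumes a classifier that returns JSON with a `route` field; `classify_route`, `RoutingError`, `ALLOWED_ROUTES`, and the `call_llm` callable are hypothetical names for illustration, not the project's API.

```python
import json
from typing import Callable, Optional

# Assumed route labels, named after the safe-turn branches.
ALLOWED_ROUTES = {"memory_control", "grounded_lookup", "support"}


class RoutingError(RuntimeError):
    """Raised when the classifier never yields a valid route."""


def classify_route(
    call_llm: Callable[[str], str],
    message: str,
    max_attempts: int = 3,
) -> str:
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            route = json.loads(call_llm(message))["route"]
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            # Malformed output: remember why and retry.
            last_error = exc
            continue
        if isinstance(route, str) and route in ALLOWED_ROUTES:
            return route
        last_error = ValueError(f"unknown route: {route!r}")
    # No regex fallback: the failure is surfaced to the caller.
    raise RoutingError(
        f"routing failed after {max_attempts} attempts"
    ) from last_error
```

The natural-language decision stays with the model; the local code only checks that the answer is well-formed and within the allowed set.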

Post-response memory evaluation. Semantic and procedural extractors run after turn finalization as runtime-managed side effects. Some candidates commit immediately; others are buffered for session end or dropped. Extractor diagnostics are merged into the final turn output.

Reducer-backed accumulation. history and transcript use operator.add reducers. Each turn emits only the new entries (user turn at init, assistant turn at finalize); the checkpointer accumulates prior turns automatically. No get_history() calls on the hot path.
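
The reducer idea can be illustrated without any graph framework. In this sketch, `TurnState` and `apply_delta` are hypothetical stand-ins for the real AgentState and checkpointer; only the use of `operator.add` on `history` and `transcript` comes from the text above.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class TurnState(TypedDict):
    # The reducer is carried as Annotated metadata on each field.
    history: Annotated[list, operator.add]
    transcript: Annotated[list, operator.add]


def apply_delta(state: dict, delta: dict) -> dict:
    """Merge a per-turn delta into prior state using each field's reducer."""
    hints = get_type_hints(TurnState, include_extras=True)
    merged = dict(state)
    for key, value in delta.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            # Reducer-backed field: combine prior value with the new entries.
            merged[key] = metadata[0](merged[key], value)
        else:
            # Plain field: last write wins.
            merged[key] = value
    return merged


state = {"history": [], "transcript": []}
state = apply_delta(state, {"history": ["user turn"]})       # emitted at init
state = apply_delta(state, {"history": ["assistant turn"]})  # emitted at finalize
```

Each stage emits a single-element list, and accumulation happens in the merge step rather than by re-reading history.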

Structured working memory. The load-memory stage returns raw WorkingMemoryEntry dicts (semantic and episodic types). Formatting happens on demand at prompt-build time and CLI render time, not inside the response model.
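
As a small illustration of keeping entries raw and formatting late: the `type` and `text` fields, the `format_for_prompt` helper, and the sample entries below are all hypothetical. The source only states that entries are dicts with semantic and episodic types.

```python
from typing import List, Literal, TypedDict


class WorkingMemoryEntry(TypedDict):
    # Hypothetical field names; the real entry schema may differ.
    type: Literal["semantic", "episodic"]
    text: str


def format_for_prompt(entries: List[WorkingMemoryEntry]) -> str:
    """Render raw entries to text only when a prompt is being built."""
    return "\n".join(f"[{entry['type']}] {entry['text']}" for entry in entries)


# Illustrative data only.
entries: List[WorkingMemoryEntry] = [
    {"type": "semantic", "text": "Prefers short replies"},
    {"type": "episodic", "text": "Previous session discussed sleep"},
]
prompt_block = format_for_prompt(entries)
```

The stored entries are left untouched, so a CLI renderer can format the same dicts differently.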

Defense-in-depth retry. I/O branches keep explicit provider and storage failure handling at the runtime boundary so failures retry or surface instead of being hidden by fallback text.

Per-turn diagnostics. Runtime stages and side effects stamp timing and decision metadata into state["diagnostics"] via the _merge_dicts reducer. Keys include load_memory_ms, crisis_gate_ms, and turn_total_ms.
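
A hedged sketch of how timing keys might be stamped and merged. The real `_merge_dicts` reducer is not shown in this page, so `merge_dicts` below assumes a shallow merge where the newer value wins, and `timed` is a hypothetical helper.

```python
import time
from contextlib import contextmanager
from typing import Iterator, Optional


def merge_dicts(left: Optional[dict], right: Optional[dict]) -> dict:
    """Shallow-merge two diagnostics dicts; keys in `right` win (assumption)."""
    return {**(left or {}), **(right or {})}


@contextmanager
def timed(diagnostics: dict, key: str) -> Iterator[None]:
    """Record elapsed wall time in milliseconds under `key`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        diagnostics[key] = round((time.perf_counter() - start) * 1000, 3)


stage_diagnostics: dict = {}
with timed(stage_diagnostics, "crisis_gate_ms"):
    pass  # the stage's work would run here

# Each stage returns only its own keys; the reducer folds them into state.
diagnostics = merge_dicts({"load_memory_ms": 1.0}, stage_diagnostics)
```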

Runtime context

Dependencies are injected via WorkflowContext, a frozen dataclass with attribute access (runtime.context.llm_client, not dict access). Fields: llm_client, memory_store, crisis_log_backend, memory_mode, response_llm, embedding_provider, session_memory_buffer. See Runtime for the full breakdown.
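
A minimal sketch of what such a context could look like. The field names are taken from the list above; the class name `WorkflowContextSketch`, the `Any` and `str` type annotations, and the placeholder values are assumptions for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any


@dataclass(frozen=True)
class WorkflowContextSketch:
    # Field names follow the documented list; types are placeholders.
    llm_client: Any
    memory_store: Any
    crisis_log_backend: Any
    memory_mode: str
    response_llm: Any
    embedding_provider: Any
    session_memory_buffer: Any


context = WorkflowContextSketch(
    llm_client=object(),
    memory_store=object(),
    crisis_log_backend=object(),
    memory_mode="example-mode",  # placeholder value
    response_llm=object(),
    embedding_provider=object(),
    session_memory_buffer=[],
)

# Attribute access, not dict access.
client = context.llm_client

# Frozen dataclasses reject reassignment after construction.
try:
    context.memory_mode = "other"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Freezing the context means a stage cannot swap a dependency mid-turn, which keeps injected services consistent across the whole route plan.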

Key files

| File | What it does |
| --- | --- |
| agent/runtime/turn.py | Compatibility entrypoint for build_initial_state(), state_to_output(), and run_agent() |
| agent/state.py | AgentState TypedDict with reducer annotations and split input/output schemas |
| agent/memory/entries.py | WorkingMemoryEntry types, factory functions, formatters |
| agent/runtime/workflow_context.py | WorkflowContext frozen dataclass |
| agent/runtime/openai_text_runtime.py | OpenAI Agents SDK text runtime orchestration |
| agent/specialists/ | Triage, therapeutic, crisis, and guided-exercise specialist agent definitions |
| agent/tools/ | Agent-owned memory, grounded lookup, crisis lookup, therapeutic response skill, and guided-exercise tools |
| agent/skills/therapeutic_response.py | Response-style guidance rendered as prompt-local SDK skill context |
| agent/specialists/therapeutic_response/prompts.py | Therapeutic response prompt composition owned by the therapeutic agent |
| agent/specialists/guided_exercise.py | Guided-exercise agent instructions and system prompt ownership |
| agent/skills/guided_exercises/ | Guided-exercise definitions, selection, state transitions, memory helpers, and skill rendering |
| agent/runtime/memory_context.py | Runtime-owned turn-memory loading backed by agent/memory/retrieval/service.py |
| agent/guardrails/ | Crisis input guardrail, classifier service, prompts, and safety delta construction |
| agent/tools/crisis.py | Crisis resource lookup tool and response delta |
| agent/memory/control/service.py | Executes memory commands (list, forget, recall toggle, save preference) |
| agent/tools/grounded.py | Search-grounded factual replies |
| agent/runtime/session/commit.py | Session-end promotion path backed by agent/memory/commit/service.py |
| agent/runtime/session/summarize.py | Session-end episodic summarization path backed by agent/memory/operations/episodic.py |
| agent/runtime/turn_finalization.py | Transcript finalization (single-element delta) |
| agent/audit/crisis_log.py | Audit-log write helper, backend protocol, and in-memory implementations |
| agent/audit/postgres_crisis_log.py | Primary durable crisis log with retention purge |
| agent/audit/sqlite_crisis_log.py | SQLite crisis-log fallback backend |
| agent/runtime/runtime.py | PersistentAgentRuntime: configured persistent thread state |

Adding a new therapeutic response style

  1. Create agent/prompts/sources/response_styles/your_style.md with the response style's knowledge content
  2. Add prompt-source wiring in agent/specialists/therapeutic_response/prompt_sources.py and instructions in agent/specialists/therapeutic_response/prompt_instructions.py
  3. Update agent/skills/therapeutic_response.py if the style needs different prompt-local skill metadata
  4. Update load_therapeutic_response_skill guidance if the therapeutic response agent needs clearer selection criteria for the new style
  5. Add or update backend tests for the route and state contract
  6. Add a STAGE_LABELS entry in agent/models.py if the runtime stage name needs a distinct user-facing label