Context Management

On every text turn, the runtime retrieves long-term memory, relies on the SDK session for short-term conversation continuity, and assembles the context that the OpenAI Agents SDK runner sees. Voice instead uses a compact memory bootstrap at Realtime session start and records the transcript at turn end; see Voice Persistence.

Per-turn retrieval

The runtime turn memory context runs before ordinary therapeutic responses. Crisis, memory-control, and grounded lookup paths can skip or ignore the retrieval result. When it runs, it populates:

Output	What it contains	Retrieval method
`working_memory`	`WorkingMemoryEntry` dicts (semantic + episodic)	Hybrid RRF (embedding + token-recall)
`session_memory.summary`	Per-turn human-readable summary of what was retrieved	Written by turn memory context
`procedural_profile.procedural_rules`	Style directives from `ProceduralProfile`	Full profile load (not query-based)
`procedural_profile.proactive_recall_enabled`	Recall toggle	Read from procedural profile
`diagnostics`	Hit counts, store sizes, retrieval path	Written via `_merge_dicts` reducer
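
To make the "Hybrid RRF" retrieval method in the table above concrete, here is a minimal sketch of reciprocal rank fusion over two ranked lists, one from embedding similarity and one from token recall. The function name `rrf_fuse`, the memory ids, and the constant `k=60` (a commonly used default) are assumptions for illustration; the runtime's actual fusion code and parameters are not shown in this document.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


# Hypothetical rankings from the two retrievers.
embedding_ranked = ["m3", "m1", "m2"]
token_ranked = ["m1", "m4", "m3"]

fused = rrf_fuse([embedding_ranked, token_ranked])
print(fused)
```

An entry that ranks well in both lists (`m1`, `m3`) ends up ahead of entries that appear in only one, which is the point of fusing the two signals rather than trusting either alone.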

First-turn catch-up

On the first turn of a new session (transcript length = 1), the most recent episodic arc is automatically injected as a catch-up entry, regardless of query match. This produces the "last time we talked..." experience.
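
The catch-up rule can be sketched as follows. This is an illustration under stated assumptions, not the runtime's code: the helper name `with_catch_up`, the `created_at` field used to find the most recent arc, and the choice to place the catch-up entry first are all hypothetical.

```python
def with_catch_up(transcript, retrieved, episodic_store):
    """On the first turn, put the most recent episodic arc ahead of query results."""
    if len(transcript) != 1 or not episodic_store:
        return list(retrieved)
    # Assumption: arcs carry a sortable `created_at` value.
    latest = max(episodic_store, key=lambda arc: arc["created_at"])
    # Avoid listing the same arc twice if the query also matched it.
    rest = [entry for entry in retrieved if entry != latest]
    return [latest] + rest


store = [
    {"summary": "Discussed sleep routine", "created_at": 1},
    {"summary": "Talked about work stress", "created_at": 2},
]

first_turn = with_catch_up(["user: hi"], [], store)
later_turn = with_catch_up(["user: hi", "assistant: hello", "user: ok"], [], store)
```

Here `first_turn` contains the latest arc even though the query matched nothing, while `later_turn` is left as retrieved.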

The retrieval pipeline runs semantic and episodic searches in parallel (`asyncio.gather`), then loads the procedural profile. Episodic results are deduped by `(summary, primary_themes)` so the same arc never appears twice on a single turn.
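
A minimal sketch of that pipeline shape, assuming two hypothetical async search functions (`search_semantic`, `search_episodic`) that stand in for the real store queries. Only `asyncio.gather` and the `(summary, primary_themes)` dedupe key come from the description above; everything else is illustrative.

```python
import asyncio


async def search_semantic(query):
    # Hypothetical stand-in for the semantic store query.
    return [{"kind": "semantic", "fact": "prefers morning sessions"}]


async def search_episodic(query):
    # Hypothetical stand-in; returns the same arc twice on purpose.
    arc = {
        "kind": "episodic",
        "summary": "Talked about work stress",
        "primary_themes": ["work", "stress"],
    }
    return [arc, dict(arc)]


def dedupe_episodic(arcs):
    """Keep the first arc for each (summary, primary_themes) key."""
    seen = set()
    unique = []
    for arc in arcs:
        # Lists are unhashable, so the themes are converted to a tuple.
        key = (arc["summary"], tuple(arc["primary_themes"]))
        if key not in seen:
            seen.add(key)
            unique.append(arc)
    return unique


async def retrieve(query):
    # Both searches run concurrently; results come back in argument order.
    semantic, episodic = await asyncio.gather(
        search_semantic(query), search_episodic(query)
    )
    return semantic + dedupe_episodic(episodic)


working_memory = asyncio.run(retrieve("feeling stressed"))
```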

What the response node sees

Field	Source	Per turn?
`message`	User input	Yes
`history` / `transcript`	Checkpointer + `operator.add` reducer	Accumulated across turns
`working_memory`	Turn memory context	Re-retrieved each turn
`procedural_profile`	Turn memory context	Re-loaded each turn (merged via `_merge_dicts`)
`crisis` / `crisis_audit`	Crisis gate	Fresh each turn
`response_style`, `therapeutic_approach`	Dispatcher	Fresh each turn
`session_progress`	Checkpointer + `_merge_dicts` reducer	`turn_count` increments while preserving sibling fields
`exercise_state`	`guided_exercise_node` + dispatcher, `_merge_dicts`	`exercise_therapeutic_approach` stores the approach pinned at exercise start; active exercise state survives side-turns
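
The two reducer behaviours named in the table can be illustrated with a short sketch. `operator.add` is the standard-library function; `merge_dicts_sketch` is a hypothetical shallow-merge stand-in for `_merge_dicts`, whose real implementation is not shown here, and `goals_set` is an invented sibling field.

```python
import operator


def merge_dicts_sketch(current, update):
    """Shallow merge: keys in `update` overwrite, all other keys are kept."""
    merged = dict(current or {})
    merged.update(update or {})
    return merged


# A node returns only the field it changed; siblings survive the merge.
session_progress = {"turn_count": 3, "goals_set": True}
session_progress = merge_dicts_sketch(
    session_progress, {"turn_count": session_progress["turn_count"] + 1}
)

# An additive reducer appends new turns instead of replacing the list.
transcript = ["user: hi"]
transcript = operator.add(transcript, ["assistant: hello"])
```

With a merge-style reducer, a node that only knows about `turn_count` cannot accidentally erase fields written by other nodes.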

Exercise approach continuity

A guided exercise that started under, say, the `act` approach should keep that approach when guided exercise instructions resume. The `exercise_therapeutic_approach` field on `exercise_state` stores the pinned approach at exercise start. The guided-exercise runtime reuses it when an active exercise routes back to `guided_exercise`. A side-turn preserves the active exercise state while the therapeutic response skill may choose a fresh top-level approach for the explanation. Only an explicit exit clears `exercise_therapeutic_approach` alongside `exercise_type` and `exercise_step`.

This is enforced by `agent/skills/guided_exercises/` and the runtime-owned exercise state checks in `agent/runtime/openai_text_runtime.py`, not by a separate post-processing selector.
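
The continuity rule can be summarised in a small sketch. The field names come from the text above; the helper names `resolve_approach` and `exit_exercise`, the route name `therapeutic_response`, and the sample values `cbt` and `defusion` are assumptions for illustration and do not mirror the actual runtime code.

```python
EXERCISE_KEYS = ("exercise_therapeutic_approach", "exercise_type", "exercise_step")


def resolve_approach(exercise_state, dispatcher_approach, route):
    """Reuse the pinned approach only when an active exercise resumes."""
    pinned = exercise_state.get("exercise_therapeutic_approach")
    exercise_active = bool(exercise_state.get("exercise_type"))
    if route == "guided_exercise" and exercise_active and pinned:
        return pinned
    # Side-turns and ordinary turns use the dispatcher's fresh choice.
    return dispatcher_approach


def exit_exercise(exercise_state):
    """Explicit exit: clear the pinned approach together with type and step."""
    return {k: v for k, v in exercise_state.items() if k not in EXERCISE_KEYS}


state = {
    "exercise_type": "defusion",
    "exercise_step": 2,
    "exercise_therapeutic_approach": "act",
}

resumed = resolve_approach(state, "cbt", "guided_exercise")
side_turn = resolve_approach(state, "cbt", "therapeutic_response")
after_exit = exit_exercise(state)
```

Note that the side-turn reads a fresh approach without mutating `state`, so the pinned value is still there when the exercise resumes.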

Session stage

Not yet dynamic

Dynamic stage inference (opening → deepening → working → closing) is a planned enhancement. The knowledge file (`agent/prompts/sources/session_stages.md`) exists, but stage inference is not wired into runtime state or any active runtime stage.

Prompt injection by response style

Not every response style sees the same context. The matrix below shows which context fields are injected into each response prompt.

Matrix columns: Response guidance, Stage guidance, Session context, Recent history, Current message

Response style	Cells (left to right)
`supportive`	✓
`reflective`	✓
`clarifying`	✓ – ✓
`psychoeducation`	✓
`technique`	✓
`guided_exercise`	✓
`closing`	✓
`safety_check`	✓ – ✓
`crisis_response`	✓ – ✓
`memory_control`	– ✓
`grounded_lookup`	– ✓

Diagnostics

Turn memory context stamps the following diagnostics so each turn's retrieval is auditable from CLI / Opik:

Key	What it reports
`load_memory_ms`	Total wall time including all parallel queries
`semantic_hits` / `semantic_store_size`	Returned vs. stored semantic facts
`episodic_hits` / `episodic_store_size`	Returned vs. stored episodic arcs
`procedural_count`	Number of procedural rules loaded
`retrieval_path`	`hybrid_rrf` / `token_recall` / `token_recall_after_embed_error`
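
As an illustration of how `retrieval_path` and `load_memory_ms` might be stamped, the sketch below picks a path label and measures wall time. The conditions attached to each label are inferred from the label names and are an assumption, as are the helper `choose_retrieval_path`, the `embeddings_enabled` flag, and the failing `embed` stub.

```python
import time


def embed(query):
    # Hypothetical embedder that fails, to exercise the fallback path.
    raise RuntimeError("embedding service unavailable")


def choose_retrieval_path(query, embeddings_enabled, embed_fn):
    """Return the path label recorded in diagnostics (assumed semantics)."""
    if not embeddings_enabled:
        return "token_recall"
    try:
        embed_fn(query)
    except Exception:
        # Embedding failed, so retrieval falls back to token recall only.
        return "token_recall_after_embed_error"
    return "hybrid_rrf"


start = time.perf_counter()
path = choose_retrieval_path("sleep trouble", True, embed)
diagnostics = {
    "retrieval_path": path,
    # Wall time in milliseconds for the whole memory load.
    "load_memory_ms": (time.perf_counter() - start) * 1000.0,
}
```

Recording the fallback as its own label, rather than reusing `token_recall`, keeps embedding failures visible when auditing a turn.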
