Runtime & Persistence
Two text-runtime layers, one pipeline. The stateless layer is for
single-turn requests. The persistent layer adds thread-aware
checkpointing, candidate buffering, an inactivity sweeper, and the
shared session-end path (/end, timeout, shutdown, API end-session).
Two layers
AgentOutputresponse + crisis + diagnosticsBoth layers run the same runtime stages in the same order. The persistent layer saves app-owned state through the configured backend after each turn and restores reducer-backed fields on the next.
The browser owns OpenAI Realtime WebRTC audio, while the backend owns
session configuration, memory bootstrap, tool execution, turn recording,
and session finalization. Voice does not drive full text turns
through run_turn / run_turn_stream; finalized Realtime transcripts
are recorded through record_voice_turn(...), and persistent voice
sessions end through the same end_session(...) summarization path as
text. The 20-min inactivity sweeper remains text-session oriented.
See Voice for the full picture.
How state accumulates
get_history() on the hot pathbuild_initial_state() emits only the current user turn. The
checkpointer restores prior turns automatically via reducers. This
eliminated the O(n) transcript deserialization that ran on every
persistent turn.
| Field | Reducer | Behavior |
|---|---|---|
history | operator.add | Each turn appends new entries; checkpointer accumulates |
transcript | operator.add | Turn finalization appends a 1-element delta |
session_progress | _merge_dicts | turn_count increments while preserving sibling fields |
session_memory | _merge_dicts | Summary / active concerns / open loops merge across turns |
procedural_profile | _merge_dicts | Procedural rules + recall toggle merge |
exercise_state | _merge_dicts | Active exercise continuity; exercise_therapeutic_approach pins the approach used when guidance resumes |
memory_control | _merge_dicts | pending_action carries destructive deletes across turns |
diagnostics | _merge_dicts | Runtime stages and side effects write independently; gate timings, retrieval path, and write counts merge |
Non-reducer fields (crisis, crisis_audit, route,
response_text, response_style, therapeutic_approach, the
turn-scoped lookup fields) are overwritten fresh each turn — they
describe a single turn's decisions and replies.
Thread lifecycle
# Session 1 — 3 turns + end
$ opencouch --thread-id alice-s1 --user-id alice
> Hi there # turn 1: checkpoint created
> I've been feeling anxious # turn 2: transcript accumulates
> Can we do a grounding exercise? # turn 3: exercise state persists via progress reducer
> /end # feedback prompt → summarize → episodic arc written
# Session 2 — same user, new thread
$ opencouch --thread-id alice-s2 --user-id alice
> Hey # first-turn catch-up fires: "Last session (anxiety)..."
# alice's semantic facts + procedural rules visible| Event | What happens |
|---|---|
| First turn | Checkpoint created. build_initial_state() provides defaults; opencouch_active_sessions row registers the active session. |
| Subsequent turns | Checkpointer restores accumulated state. Only the new user turn is emitted. The session row's last_active_at updates. |
/end | Optional feedback prompt → record_session_feedback() → end_session() → summarize_session triggers service-backed episodic summarization/persistence → commit_session_memory triggers service-backed promotion of held candidates → active-session row deleted. |
| 20-min inactivity | Background sweeper finds expired session rows and runs the same end_session() flow with the runtime's default LLM client. Held candidates and episodic arcs are still written even if the user never typed /end. |
| Process shutdown | __aexit__ best-effort finalizes anything still open (when finalize_active_sessions_on_close=True, the default). |
Resume after /end | Same thread_id works. Transcript persists. Next turn starts a fresh session and a fresh candidate buffer. |
| Incognito | All four backends in-memory. Nothing touches disk. Crisis log + feedback still record (ephemeral). The active-session table is skipped. |
The --user-id flag
# Without --user-id: memory scoped to thread
$ opencouch --thread-id thread-a # facts written to "thread-a" namespace
$ opencouch --thread-id thread-b # can't see thread-a's facts
# With --user-id: memory scoped to user across threads
$ opencouch --thread-id s1 --user-id alice # facts written to "alice"
$ opencouch --thread-id s2 --user-id alice # sees alice's facts from s1Identity and thread fallbacks
PersistentAgentRuntime does not generate thread ids. Its text turn
methods require callers to pass thread_id, and the runtime carries
that value as session_id inside runtime state. Defaults live at the
caller boundary:
| Surface | Missing thread_id | Missing user_id | Memory owner |
|---|---|---|---|
| Runtime API | No runtime fallback; caller must provide one | Accepted as None | user_id if set, otherwise session_id (thread_id) |
| HTTP / WebSocket text API | Request validation fails | Accepted as None | user_id if set, otherwise thread_id |
| CLI text | Generates local-<12 uuid hex> | Persistent mode falls back to the active thread id; guest mode ignores --user-id | user_id if set, otherwise active thread_id |
| Web UI | Blank setup field generates web-<8 random base36> | Blank persistent setup uses web-user; incognito clears user_id | user_id if set, otherwise thread_id |
| Web voice | Reuses the active web thread_id from setup | Persistent mode uses the active web user id; incognito clears user_id | user_id if set, otherwise active thread_id |
The generated thread id is never derived from the user id. The fallback
goes the other direction: when no stable user_id is supplied, memory
ownership falls back to the thread/session id so each thread stays
isolated by default.
WorkflowContext
Runtime dependencies injected as a frozen dataclass. Runtime stages
access via runtime.context.llm_client — not dict access.
@dataclass(slots=True, frozen=True)
class WorkflowContext:
llm_client: BaseLLMClient | None # control-plane LLM (safety, routing, session finalization)
memory_store: MemoryStore # unified read/write across semantic / episodic / procedural
crisis_log_backend: CrisisLogBackend # always-on audit trail
memory_mode: MemoryMode # INCOGNITO / LOCAL / SYNCED
response_llm: BaseLLMClient | None = None # optional response-writing LLM; falls back to llm_client
embedding_provider: EmbeddingProvider | None = None # for hybrid retrieval and write-time indexing
session_memory_buffer: SessionMemoryBuffer | None = None # held candidates until session end
A convenience property control_llm returns llm_client for
stages that just want "the safety / routing / memory model"
without caring whether a separate response model is configured.
Immutability guarantees that no stage can accidentally modify a
shared dependency during a turn. The slots=True flag reduces
memory overhead. Both are free correctness wins.
Active session recovery
PersistentAgentRuntime keeps an opencouch_active_sessions table
in the configured active-session backend. Each row carries the
thread's session_buffer (held semantic / procedural candidates),
max_crisis_level, and transcript_start_index for the current
session. A 20-minute inactivity sweeper auto-finalizes expired
sessions, and __aexit__ best-effort finalizes anything still open
on shutdown — so held candidates and session-end summarization
trigger reliably even if the user never types /end.
In INCOGNITO mode the same flow runs entirely in memory: the
checkpoint and active-session tracking are process-local and no durable
session row is written.
Key files
| File | Purpose |
|---|---|
agent/runtime/runtime.py | PersistentAgentRuntime — run_turn, run_turn_stream, end_session, record_session_feedback, sweeper, active-session recovery |
agent/runtime/turn.py | build_initial_state, state_to_output, run_agent |
agent/state.py | All state fragments + AgentGraphInputState / AgentState / AgentGraphOutputState |
agent/runtime/workflow_context.py | WorkflowContext frozen dataclass |
agent/audit/crisis_log.py | CrisisLogBackend protocol + in-memory implementation |
agent/audit/postgres_crisis_log.py | Primary Postgres crisis log with retention purge |
agent/audit/sqlite_crisis_log.py | SQLite crisis-log fallback backend |
agent/feedback/session_feedback.py | SessionFeedbackBackend protocol + in-memory implementation |
agent/feedback/postgres_session_feedback.py | Primary Postgres feedback store |
agent/feedback/sqlite_session_feedback.py | SQLite feedback fallback backend |