
Architecture

Single Python backend. Explicit LangGraph agent graph. Three memory layers backed by SQLite. CLI-first development.


Turn pipeline

Safety-first ordering

The crisis gate is the first node in the graph. Memory only loads on the therapeutic branch — if a message triggers a crisis response, memory retrieval is skipped entirely.
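
The ordering described above can be sketched with a tiny stdlib-only dispatcher. This is an illustrative sketch, not the project's LangGraph graph: the `TurnState` class, the `CRISIS_MARKERS` tuple, and the keyword check are hypothetical stand-ins (the real gate is described as an LLM-primary classifier with deterministic overrides and a regex fallback). The point it shows is that the memory step is only reachable from the non-crisis branch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical placeholder markers; the real gate uses a classifier, not this list.
CRISIS_MARKERS = ("i am in danger",)


@dataclass
class TurnState:
    message: str
    memory: list = field(default_factory=list)   # filled only on the therapeutic path
    visited: list = field(default_factory=list)  # node names in execution order


def crisis_gate(state: TurnState) -> Optional[str]:
    # First node for every message: decide the branch before anything else runs.
    state.visited.append("crisis_gate")
    is_crisis = any(marker in state.message.lower() for marker in CRISIS_MARKERS)
    return "crisis_response" if is_crisis else "load_memory"


def crisis_response(state: TurnState) -> Optional[str]:
    state.visited.append("crisis_response")
    return "crisis_log"


def crisis_log(state: TurnState) -> Optional[str]:
    state.visited.append("crisis_log")
    return "finalize"


def load_memory(state: TurnState) -> Optional[str]:
    # Memory retrieval happens here and nowhere else.
    state.visited.append("load_memory")
    state.memory.append("retrieved fact (placeholder)")
    return "therapeutic"


def therapeutic(state: TurnState) -> Optional[str]:
    state.visited.append("therapeutic")
    return "finalize"


def finalize(state: TurnState) -> Optional[str]:
    state.visited.append("finalize")
    return None  # end of turn


NODES: dict = {
    "crisis_gate": crisis_gate,
    "crisis_response": crisis_response,
    "crisis_log": crisis_log,
    "load_memory": load_memory,
    "therapeutic": therapeutic,
    "finalize": finalize,
}


def run_turn(message: str) -> TurnState:
    """Run one turn, always starting at the crisis gate."""
    state = TurnState(message=message)
    node: Optional[str] = "crisis_gate"
    while node is not None:
        step: Callable[[TurnState], Optional[str]] = NODES[node]
        node = step(state)
    return state
```

In this sketch a crisis turn finishes with an empty `memory` list, because `load_memory` is never on its path.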

1. crisis_gate_node: Every message, no exceptions. LLM-primary classifier with deterministic overrides and regex fallback. Command(goto=...) routes to one branch:

   Crisis path
   2a. crisis_response_node: PFA overlay + web-searched local hotlines
   3a. crisis_log_node: Audit trail; writes regardless of memory mode

   Therapeutic path
   2b. load_memory_node: Hybrid RRF retrieval across 3 namespaces
   3b. therapeutic_subgraph: Dispatcher → 1 of 7 response styles × 7 approaches

Both paths converge
4. finalize_turn_node: Append response to transcript via operator.add reducer. No I/O, so no retry.

Parallel fan-out
5. extract_semantic_facts_node: Candidate extraction → deterministic write policy
5. extract_procedural_rules_node: Style rules → immediate commit or session-end hold

Session end (triggers: /end · timeout · shutdown · voice disconnect)
6. summarize_session: Episodic arc for cross-session continuity
7. commit_session_memory: Promote held semantic + procedural candidates
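
To make the reducer behaviour in steps 4 and 5 concrete, here is a minimal stdlib sketch of how per-key reducers might combine updates from several nodes. The `merge_dicts` function is an assumed stand-in for the project's `_merge_dicts` (whose exact semantics are not shown on this page), and `apply_updates` is a hypothetical helper that mimics, in simplified form, what a graph runtime does when it applies node outputs to state.

```python
import operator


def merge_dicts(left: dict, right: dict) -> dict:
    # Assumed behaviour: shallow merge, right-hand keys win on conflict.
    return {**left, **right}


# One reducer per state key (key names here are illustrative).
REDUCERS = {
    "transcript": operator.add,   # list + list -> accumulated transcript
    "diagnostics": merge_dicts,   # lets parallel nodes each add their own keys
}


def apply_updates(state: dict, *updates: dict) -> dict:
    """Return a new state with each update folded in via its key's reducer."""
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            reducer = REDUCERS.get(key)
            if reducer is not None and key in merged:
                merged[key] = reducer(merged[key], value)
            else:
                merged[key] = value  # no reducer: last write wins
    return merged
```

With reducers like these, two extraction nodes running in parallel can each return a partial `diagnostics` dict without overwriting the other's entry.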

Every I/O node has RetryPolicy(max_attempts=2) as defense-in-depth.
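
As a rough illustration of what a two-attempt policy means in practice, the following stdlib sketch wraps a callable so that it is tried at most `max_attempts` times. It is not LangGraph's `RetryPolicy` implementation; the `with_retry` helper and the choice to retry only on `OSError` are assumptions made for the example, and backoff is omitted.

```python
from typing import Callable, Tuple, Type


def with_retry(
    fn: Callable,
    max_attempts: int = 2,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> Callable:
    """Wrap fn so it is attempted up to max_attempts times on selected errors."""

    def wrapper(*args, **kwargs):
        last_error = None
        for _attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except retry_on as exc:
                last_error = exc  # remember the failure and try again
        raise last_error  # every attempt failed: surface the last error

    return wrapper
```

With `max_attempts=2`, a node that fails once on a transient I/O error succeeds on the second call, while a persistent failure is raised after exactly two attempts.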


Key decisions

| Decision | Choice | Why |
| --- | --- | --- |
| Execution | LangGraph StateGraph | Explicit branches, deterministic ordering, checkpoint persistence |
| LLM | BaseLLMClient protocol | Provider-agnostic: Gemini + OpenAI satisfy the same interface |
| Embedding | EmbeddingProvider protocol | Gemini text-embedding-004 default; null provider for offline |
| Storage | SQLite via aiosqlite | Single-file, no server, survives restart |
| Memory | MemoryStore protocol | In-memory for incognito/tests; SQLite for persistent |
| Retrieval | Hybrid RRF | Embedding cosine + token-recall fused via Reciprocal Rank Fusion |
| Prompt sources | agent/prompts/sources/*.md files | Reviewed prompt fragments; composed at runtime |
| Context | WorkflowContext frozen dataclass | Attribute access, type-safe, immutable per turn |
| Reducers | operator.add + _merge_dicts | Transcript accumulation + parallel diagnostics |
| Observability | LangSmith + local diagnostics | LangSmith for trace-level debugging and evaluation review; in-CLI diagnostics for per-turn visibility |
| Crisis log | Always-on | Privacy asymmetry: incognito scrubs user_id but still records |
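
The retrieval row above names Reciprocal Rank Fusion; a minimal sketch of the standard formula follows. Each item scores the sum of 1 / (k + rank) over the rankings it appears in. The function name, the tie-breaking rule, and the constant `k=60` (a commonly used default) are assumptions for illustration; the project's actual constant and weighting are not stated on this page.

```python
from typing import Iterable, List, Sequence


def reciprocal_rank_fusion(rankings: Iterable[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several best-first rankings of item ids into one ranking."""
    scores: dict = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            # Items ranked highly in any list get a larger contribution.
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Example: one ranking from embedding cosine, one from token recall.
by_cosine = ["a", "b", "c"]
by_token_recall = ["b", "d", "a"]
fused = reciprocal_rank_fusion([by_cosine, by_token_recall])
# fused == ["b", "a", "d", "c"]
```

Because only ranks are used, the two retrievers' raw scores never need to be put on a common scale.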

Persistence

Four SQLite databases

Each under .store/, each owning its schema independently. Paths overridable via CLI flags. Incognito mode uses in-memory backends for all four — nothing touches disk.

| Database | Owner | What it persists | Retention |
| --- | --- | --- | --- |
| threads.sqlite3 | LangGraph checkpointer | Conversation state snapshots | Indefinite |
| memory.sqlite3 | SqliteMemoryStore | Semantic facts, episodic arcs, procedural profiles | User-controlled |
| crisis.sqlite3 | SqliteCrisisLogBackend | Crisis event audit trail | 90 days |
| session_feedback.sqlite3 | SqliteSessionFeedbackBackend | End-of-session thumbs ratings | 180 days |
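
A retention window like the 90-day and 180-day entries above could be enforced with a periodic purge. The sketch below uses the stdlib `sqlite3` module rather than aiosqlite so that it is self-contained, and the `events` table with its `created_at` column is a hypothetical schema; how the project actually enforces retention is not described on this page.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def purge_expired(
    conn: sqlite3.Connection,
    retention_days: int,
    now: Optional[datetime] = None,
) -> int:
    """Delete rows older than the retention window; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    # Timestamps are stored as UTC ISO-8601 strings with second precision,
    # so lexicographic comparison matches chronological order.
    cutoff = (now - timedelta(days=retention_days)).isoformat(timespec="seconds")
    cursor = conn.execute("DELETE FROM events WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


def make_demo_db(now: datetime) -> sqlite3.Connection:
    """Build an in-memory database with one expired and one recent row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)")
    for age_days in (100, 10):
        stamp = (now - timedelta(days=age_days)).isoformat(timespec="seconds")
        conn.execute("INSERT INTO events (created_at) VALUES (?)", (stamp,))
    conn.commit()
    return conn
```

Passing `now` explicitly keeps the purge deterministic in tests; in production the default of the current UTC time would apply.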

Prompt layers

Six layers composed per turn, outermost first.

See Prompt Assembly for the full composition logic.


Package layout

| Package | Owns |
| --- | --- |
| agent/ | Graph entrypoint, state schema, models, runtime context |
| agent/nodes/ | One file per graph node (8 nodes) |
| agent/memory/ | Store protocol, embeddings, retrieval, dedup, hashing |
| agent/therapeutic/ | Subgraph: dispatcher, seven response style nodes, prompts |
| agent/tools/ | Web search for crisis resources |
| services/llm/ | BaseLLMClient + Gemini + OpenAI clients |
| opencouch_cli/ | Rich-based interactive CLI |
| voice/ | OpenAI Realtime voice sessions |
| api/ | FastAPI routes (chat, threads, memory) |
| tests/ | 630+ pytest tests |
| eval/ | 5 eval harnesses with curated datasets |

| Topic | Page |
| --- | --- |
| Agent graph | Graph |
| Node catalog | Nodes |
| Tools | Tools |
| State schema | State |
| Memory | Memory |
| Crisis gate | Crisis Gate |
| Runtime | Runtime |
| Observability | Observability |
| Privacy | Privacy |