OpenCouch
A mental health support agent that remembers you across sessions, checks for safety before every response, and adapts its style to how you prefer to be supported.
OpenCouch moves quickly. Docs stay close to the code, but some pages may lag briefly after larger refactors or dogfood changes.
The system at a glance
Every turn enters through the same safety gate. If the gate stays silent, an LLM-primary triage picks one of four routes, and only that route runs. After the reply is sealed, an off-turn extract step decides whether anything from the exchange is worth remembering.
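The flow above can be sketched as a single function. This is a minimal illustration, not the project's actual code: `run_turn`, `TurnResult`, and the callable signatures are hypothetical names, and skipping extraction when the gate fires is an assumption the docs do not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TurnResult:
    reply: str
    route: Optional[str]  # None when the safety gate produced the reply
    memory_candidates: list[str] = field(default_factory=list)


def run_turn(
    message: str,
    safety_gate: Callable[[str], Optional[str]],
    triage: Callable[[str], str],
    routes: dict[str, Callable[[str], str]],
    extract: Callable[[str, str], list[str]],
) -> TurnResult:
    """Run one turn: safety gate, then triage, then exactly one route."""
    # Step 1: the gate runs before anything else. A non-None value means
    # the gate fired, so triage and routes are skipped (assumed behavior).
    gate_reply = safety_gate(message)
    if gate_reply is not None:
        return TurnResult(reply=gate_reply, route=None)

    # Step 2: triage names a single route, and only that handler runs.
    route = triage(message)
    reply = routes[route](message)

    # Step 3: extraction sees the sealed reply and cannot alter it.
    # Inline here for brevity; the docs describe it as off-turn.
    candidates = extract(message, reply)
    return TurnResult(reply=reply, route=route, memory_candidates=candidates)
```

Passing the gate, triage, routes, and extractor as plain callables keeps the ordering guarantee visible in one place and makes each stage easy to stub in tests.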
The OpenCouch Pipeline
Therapeutic
Autonomous Agent · TherapeuticAgent
The primary conversational engine. Evaluates clinical approaches and generates empathetic, style-matched responses.
agent/specialists/therapeutic.py
Triage decisions live in agent/specialists/triage.py.
See the full runtime →
Text and voice surfaces
One product runtime, two live transports. Both share memory, crisis resources, guided exercises, and session persistence, but voice trades the per-turn pipeline for low-latency Realtime function tools.
| Surface | Runtime |
|---|---|
| Text | CLI, web chat, and chat APIs. One OpenAI Agents SDK turn per message, with safety-first routing, memory loading, streaming status events, and post-response extraction. |
| Voice | Browser speech-to-speech over OpenAI Realtime WebRTC. The backend creates the session, injects compact memory context, executes app-owned function tools, and finalizes persistent sessions through the shared runtime. |
How memory works
Three memory layers, retrieved per turn and loaded into prompts on demand. Most LLM agents implement only semantic memory; OpenCouch implements all three, which is what makes cross-session personalization possible.
Semantic Memory
KNOWS Sarah, USES fluoxetine, WORRIES_ABOUT work
Episodic Memory
Session 1: panic attacks, did grounding. Session 2: work stress and sleep.
Procedural Memory
Don't suggest meditation. Prefer shorter responses.
Retrieved per turn via hybrid search and session catch-up.
Procedural rules loaded as style directives.
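As a rough illustration of how the three layers could land in a prompt, the sketch below renders the example memories above into one text block. The `MemoryContext` type, the `render_memory` function, and the section headings are hypothetical; the real prompt layout may differ.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryContext:
    semantic: list[tuple[str, str]] = field(default_factory=list)  # (relation, object)
    episodic: list[str] = field(default_factory=list)              # session summaries
    procedural: list[str] = field(default_factory=list)            # style rules


def render_memory(memory: MemoryContext) -> str:
    """Render non-empty layers as prompt sections; empty layers are omitted."""
    sections = []
    if memory.procedural:
        # Procedural rules go first, as style directives (assumed ordering).
        rules = "\n".join(f"- {rule}" for rule in memory.procedural)
        sections.append(f"Style directives:\n{rules}")
    if memory.semantic:
        facts = "\n".join(f"- {rel} {obj}" for rel, obj in memory.semantic)
        sections.append(f"Known facts:\n{facts}")
    if memory.episodic:
        arcs = "\n".join(f"- {arc}" for arc in memory.episodic)
        sections.append(f"Previous sessions:\n{arcs}")
    return "\n\n".join(sections)


memory = MemoryContext(
    semantic=[("KNOWS", "Sarah"), ("USES", "fluoxetine"), ("WORRIES_ABOUT", "work")],
    episodic=[
        "Session 1: panic attacks, did grounding",
        "Session 2: work stress and sleep",
    ],
    procedural=["Don't suggest meditation", "Prefer shorter responses"],
)
prompt_block = render_memory(memory)
print(prompt_block)
```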
Inspired by CoALA (Cognitive Architectures for Language Agents, Princeton 2023). Memory architecture →
Three pillars
LLM-only crisis gate runs before anything else, with local truth-table normalization for levels 0–3. Region-aware hotline lookup overlays verified resources onto crisis replies. Always-on audit log, even in incognito.
An LLM-primary triage picks a response style and a therapeutic approach per turn: CBT, ACT, DBT skills, MI, IPT, grief, PFA. Guided exercises pin their starting approach in exercise_state so side-turns don't drift the protocol.
Semantic facts, episodic session arcs, and procedural style rules. Extracted with structured LLM output, classified by LLM-primary write policy with local safety guards, retrieved via hybrid search with Reciprocal Rank Fusion.
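Reciprocal Rank Fusion merges several ranked lists by summing `1 / (k + rank)` for each item across the lists. The sketch below shows the standard formula with the commonly used constant `k = 60`. Which retrievers OpenCouch fuses and which `k` it uses are not stated here, so the keyword and vector lists and the memory IDs are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; higher fused score means earlier in the result."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical results from two retrievers over the same memory store.
keyword_hits = ["m2", "m1", "m3"]
vector_hits = ["m1", "m4", "m2"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['m1', 'm2', 'm4', 'm3']
```

Because only ranks matter, the two retrievers do not need comparable score scales, which is the usual reason to prefer this method for hybrid search.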
Under the hood
Per-turn stage timings, classifier paths, retrieval mode, write-policy decisions, and side-effect counters merge through one structured diagnostic channel. Opik captures the full trace.
A 20-minute inactivity sweeper auto-finalizes idle sessions through the same end-session path as /end. Held memory candidates persist across restarts so delayed promotion never silently disappears.
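One way to picture the sweeper is a periodic pass over last-activity timestamps. This sketch assumes the 20 minutes is the idle threshold rather than the sweep interval; `sweep_idle_sessions`, the in-memory `last_activity` map, and the `end_session` callback are hypothetical stand-ins for the shared end-session path.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable

IDLE_TIMEOUT = timedelta(minutes=20)  # assumed meaning of "20-minute"


def sweep_idle_sessions(
    last_activity: dict[str, datetime],
    end_session: Callable[[str], None],
    now: datetime,
    timeout: timedelta = IDLE_TIMEOUT,
) -> list[str]:
    """Finalize sessions idle for at least `timeout`; return their IDs."""
    finalized = []
    # Iterate over a snapshot so entries can be removed during the sweep.
    for session_id, seen_at in list(last_activity.items()):
        if now - seen_at >= timeout:
            # Same callback an explicit /end would use, so both paths
            # share one finalization routine.
            end_session(session_id)
            del last_activity[session_id]
            finalized.append(session_id)
    return finalized


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sessions = {
    "idle": now - timedelta(minutes=25),
    "active": now - timedelta(minutes=5),
}
ended: list[str] = []
swept = sweep_idle_sessions(sessions, ended.append, now)
print(swept)  # ['idle']
```

Passing `now` in explicitly keeps the sweep deterministic and testable without patching the clock.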
For Postgres-first persistence, privacy controls, and latency tuning, see Backend Architecture and Privacy Controls.
Important to know
OpenCouch is a research and development prototype, not a substitute for professional mental health care. It is not a licensed therapist, a diagnostic tool, or an emergency service.