
Crisis Gate

First node in the graph. Every message passes through it before memory loads, before routing, before response generation. Cannot be skipped.


Architecture

LLM-primary, with deterministic overrides and a regex fallback.


Decision paths

| Path | When it fires | What it does |
| --- | --- | --- |
| 1. Overrides | Always runs first (instant, no network) | Imminent-risk patterns → level 3. Idiomatic-safe patterns → level 0. Safety-denial after a check → level 0. |
| 2. LLM classifier | Primary path for all non-override messages | Structured output with full conversation context. Handles negation, sarcasm, quoted speech, ambiguity. |
| 3. Regex ladder | Only when LLM unavailable or call fails | Deterministic fallback using pattern categories below. Provides degraded but functional safety coverage. |
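
The ordering of the three paths can be sketched as a small dispatcher. This is a minimal illustration, not the code in agent/nodes/crisis_gate.py: the function name `assess` and its three callable parameters are hypothetical, and catching every `Exception` is an assumption about what counts as "the call fails".

```python
from typing import Callable, Optional


def assess(
    message: str,
    check_overrides: Callable[[str], Optional[int]],
    llm_classify: Callable[[str], int],
    regex_ladder: Callable[[str], int],
) -> tuple[int, str]:
    """Return (level, path), trying override, then LLM, then regex."""
    # Path 1: local pattern overrides, evaluated before any network call.
    override_level = check_overrides(message)
    if override_level is not None:
        return override_level, "override"

    # Path 2: the LLM classifier handles every non-override message.
    try:
        return llm_classify(message), "llm_primary"
    except Exception:
        # Path 3: deterministic fallback, used only when the LLM call fails.
        # (Assumption: any exception is treated as "LLM unavailable".)
        return regex_ladder(message), "deterministic"
```

The path labels returned here match the values listed for crisis_classifier_path under Diagnostics.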

The LLM path also runs a shadow deterministic assessment in parallel. Disagreements are logged for drift monitoring — if regex flags level 2+ but the LLM returns level 0, a warning is emitted.
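
The disagreement check described above might look roughly like the following. The logger name and function name are hypothetical, and the parallel execution of the two assessments is left out; the sketch only shows the comparison that triggers the warning.

```python
import logging

logger = logging.getLogger("crisis_gate")  # hypothetical logger name


def check_shadow_disagreement(llm_level: int, shadow_level: int) -> bool:
    """Warn when the regex shadow flags level 2+ but the LLM returned 0."""
    disagrees = shadow_level >= 2 and llm_level == 0
    if disagrees:
        # Logged only; the LLM result still drives the response.
        logger.warning(
            "crisis gate disagreement: llm=%d shadow=%d", llm_level, shadow_level
        )
    return disagrees
```

Because the check only logs, a disagreement never blocks or delays the response.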


Pattern categories

| Category | Level | Example |
| --- | --- | --- |
| Imminent risk | 3 | Plan + means + timing ("tonight I'm going to...") |
| Clear self-harm | 2 | Unambiguous ideation ("I want to kill myself", "kms") |
| Ambiguous | 1 | Possible risk needing clarification ("I can't do this anymore") |
| Distress | 1 | Severe distress without explicit self-harm ("hopeless", "breaking point") |
| Idiomatic safe | 0 | Benign hyperbole ("work is killing me", "I'm dead 💀") |
| Safety denial | 0 | De-escalation after a safety check ("I'm safe", "just venting") |
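
A regex ladder over these categories could be sketched as below. The patterns are toy examples built only from the phrases in the table, not the project's real pattern set. The highest-level-first ordering is an assumption, and safety denial is omitted because it depends on whether a safety check preceded the message.

```python
import re

# Toy patterns taken from the example phrases above (illustrative only).
# Assumption: rungs are tried from highest level to lowest, first match wins.
LADDER = [
    # A real imminent-risk pattern would need plan and means, not just timing.
    ("imminent_risk", 3, re.compile(r"\btonight i[’']?m going to\b", re.I)),
    ("clear_self_harm", 2, re.compile(r"\bi want to kill myself\b|\bkms\b", re.I)),
    ("ambiguous", 1, re.compile(r"\bi can[’']?t do this anymore\b", re.I)),
    ("distress", 1, re.compile(r"\bhopeless\b|\bbreaking point\b", re.I)),
    ("idiomatic_safe", 0, re.compile(r"\bis killing me\b|\bi[’']?m dead\b", re.I)),
]


def regex_level(message: str) -> tuple[int, str]:
    """Return (level, category) for the first matching rung, else (0, 'none')."""
    for category, level, pattern in LADDER:
        if pattern.search(message):
            return level, category
    return 0, "none"
```

Trying higher levels first means a message that mixes hyperbole with explicit ideation resolves to the higher level.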

Level truth table

Normalization enforces this table regardless of what the classifier returned:

| Level | needs_crisis_response | needs_clarification | Route |
| --- | --- | --- | --- |
| 0 | false | false | Therapeutic |
| 1 | false | true | Therapeutic (with safety check in response) |
| 2 | true | false | Crisis response |
| 3 | true | false | Crisis response |
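
One way to enforce the table is to derive both flags from the level alone and discard whatever flags the classifier supplied. A minimal sketch follows; the `Assessment` dataclass and `normalize` function are hypothetical names, and clamping out-of-range levels into 0–3 is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    level: int
    needs_crisis_response: bool
    needs_clarification: bool


def normalize(level: int) -> Assessment:
    """Rebuild both flags from the level, ignoring classifier-supplied flags."""
    # Assumption: out-of-range levels are clamped into the 0-3 range.
    level = max(0, min(3, int(level)))
    return Assessment(
        level=level,
        needs_crisis_response=level >= 2,  # levels 2 and 3
        needs_clarification=level == 1,    # level 1 only
    )
```

With this shape, a classifier output such as level 0 with needs_crisis_response set to true cannot survive normalization.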

Route decision

| Route | Condition | Pipeline |
| --- | --- | --- |
| crisis | needs_crisis_response = true | crisis_response → crisis_log → finalize |
| therapeutic | needs_crisis_response = false | load_memory → therapeutic_subgraph → finalize |

The gate expresses the route as Command(goto=...). It is the only branching node in the graph.
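
The branch itself reduces to one boolean. The sketch below returns only the name of the next node so that it runs without third-party imports; assuming the graph is built on LangGraph, the gate would wrap this value as `Command(goto=next_node(...))`. The function name `next_node` and the `PIPELINES` mapping are hypothetical; the node names come from the table above.

```python
# Node order for each route, as listed in the route table.
PIPELINES = {
    "crisis_response": ["crisis_response", "crisis_log", "finalize"],
    "load_memory": ["load_memory", "therapeutic_subgraph", "finalize"],
}


def next_node(needs_crisis_response: bool) -> str:
    """Return the first node of the pipeline the gate routes to."""
    return "crisis_response" if needs_crisis_response else "load_memory"
```

Both pipelines end at finalize, so the gate is the only place where the graph forks.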


Privacy asymmetry

Crisis log writes regardless of memory mode:

| In incognito | Behavior |
| --- | --- |
| user_id_or_null | None (no identity persisted) |
| session_id_opaque | SHA-256 hash, no reverse mapping |
| Event recorded? | Yes (safety audit trail preserved) |
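
A record builder consistent with this table might look like the following. The function name and the `crisis_level` field are hypothetical, and hashing the session ID outside incognito mode as well is an assumption; the source only describes the incognito case.

```python
import hashlib
from typing import Optional


def build_crisis_record(
    session_id: str, user_id: Optional[str], level: int, incognito: bool
) -> dict:
    """Build an audit record that is always written, even in incognito."""
    return {
        # No identity is persisted in incognito mode.
        "user_id_or_null": None if incognito else user_id,
        # One-way SHA-256 digest; the raw session ID is never stored.
        "session_id_opaque": hashlib.sha256(session_id.encode("utf-8")).hexdigest(),
        "crisis_level": level,  # hypothetical field name
    }
```

The digest still lets events from the same session be grouped, without storing anything that maps back to the session ID.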

Retention: 90 days. /memory purge-crisis [days] enforces the window. The boundary is exclusive: records dated exactly on the cutoff are preserved.
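
The exclusive boundary can be illustrated with an in-memory purge. This is a sketch, not the SQLite backend's logic: the function name and the `created_at` field are hypothetical, and timestamps are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def purge_crisis_log(records: list[dict], now: datetime, days: int = 90) -> list[dict]:
    """Keep records at or after the cutoff; drop only strictly older ones."""
    cutoff = now - timedelta(days=days)
    # ">=" keeps a record stamped exactly at the cutoff (exclusive boundary).
    return [r for r in records if r["created_at"] >= cutoff]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": "at_cutoff", "created_at": now - timedelta(days=90)},
    {"id": "too_old", "created_at": now - timedelta(days=91)},
    {"id": "recent", "created_at": now},
]
kept_ids = [r["id"] for r in purge_crisis_log(records, now)]
```

Here the record stamped exactly 90 days before `now` is kept, and the 91-day-old record is dropped.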


Diagnostics

| Key | Value |
| --- | --- |
| crisis_gate_ms | Wall-clock time for the full assessment |
| crisis_classifier_path | override / llm_primary / deterministic |
| crisis_level | Normalized level (0–3) |
| crisis_shadow_deterministic_level | What the regex ladder would have returned (shadow monitoring) |
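
These keys could be collected by a thin timing wrapper around the assessment. In the sketch below, the wrapper name and the shape of `assess_fn`'s return value (level, path, shadow level) are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def run_with_diagnostics(
    assess_fn: Callable[[str], tuple[int, str, Optional[int]]], message: str
) -> dict:
    """Time one full assessment and package the diagnostic keys."""
    start = time.perf_counter()
    level, path, shadow_level = assess_fn(message)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "crisis_gate_ms": elapsed_ms,
        "crisis_classifier_path": path,
        "crisis_level": level,
        "crisis_shadow_deterministic_level": shadow_level,
    }
```

`time.perf_counter()` is used because it is monotonic, so the measured duration is not affected by system clock adjustments.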

Design rules

| Rule | Why |
| --- | --- |
| Response pipeline waits for the gate | Safety sequencing > latency |
| Overrides fire before any network call | Imminent risk cannot wait 1–2 s for the LLM |
| LLM is primary, not regex | Handles negation, context, sarcasm, quoted speech |
| Shadow monitoring logs disagreements | Detects LLM drift without blocking the response |
| Normalization enforces truth table | Prevents a miscalibrated LLM from setting flags that contradict the level |
| 42-case eval dataset | Covers imminent risk, clear self-harm, idiomatic-safe, boundary cases |

Key files

| File | Purpose |
| --- | --- |
| agent/nodes/crisis_gate.py | Override detection, LLM classifier, deterministic fallback, normalization |
| agent/nodes/crisis_response.py | PFA-overlay response + web-searched local resources |
| agent/nodes/crisis_log.py | Always-on audit record writer |
| agent/memory/crisis_log.py | Backend protocol + in-memory + null |
| agent/memory/sqlite_crisis_log.py | SQLite backend with 90-day retention |
| eval/runners/crisis_gate_eval.py | Deterministic eval runner |