API Reference

The FastAPI app mounts all application routes under /api.

Run locally:

cd apps/backend
.venv/bin/python -m uvicorn main:app --port 8000 --reload

Text chat

| Route | Method | Purpose |
| --- | --- | --- |
| /api/health | GET | Health check |
| /api/chat | POST | Run one full text turn and return the completed response |
| /api/chat/stream | WebSocket | Run one text turn and stream status, chunks, and final response |

POST /api/chat and /api/chat/stream both accept a chat request with message, thread_id, optional user_id, and optional response_model_tier. Reuse the same thread_id to continue a conversation. Reuse the same user_id across thread ids to share long-term memory.
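
The request shape described above can be sketched as a small builder. This is a minimal illustration, not the backend's schema: the function name `build_chat_request`, the sample thread ids, and the choice to omit unset optional fields (rather than send null) are assumptions.

```python
import json
from typing import Optional


def build_chat_request(
    message: str,
    thread_id: str,
    user_id: Optional[str] = None,
    response_model_tier: Optional[str] = None,
) -> dict:
    """Build a chat request body, leaving out optional fields that are unset."""
    body = {"message": message, "thread_id": thread_id}
    if user_id is not None:
        body["user_id"] = user_id
    if response_model_tier is not None:
        body["response_model_tier"] = response_model_tier
    return body


# Two turns on the same thread_id continue one conversation.
first = build_chat_request("I couldn't sleep last night.", "thread-1", user_id="alice")
second = build_chat_request("It happened again.", "thread-1", user_id="alice")

# A different thread_id with the same user_id shares long-term memory.
later = build_chat_request("Hi again.", "thread-2", user_id="alice")

print(json.dumps(first, indent=2))
```

The same body would be sent as the JSON payload for POST /api/chat or as the first message on /api/chat/stream, assuming both transports take the request as JSON.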

Threads

| Route | Method | Purpose |
| --- | --- | --- |
| /api/threads | GET | List known text threads |
| /api/threads/{thread_id}/state | GET | Debug/internal. Return raw implementation state for a thread |
| /api/threads/{thread_id}/history | GET | Return user/assistant transcript turns |
| /api/threads/{thread_id}/session-status | GET | Return active-session tracking status |
| /api/threads/{thread_id}/end | POST | Finalize a text session and persist session-end memory |
| /api/threads/{thread_id}/feedback | POST | Record post-session feedback without re-finalizing the session (body: feedback, optional memory_mode, modality: text or voice) |

Debug state endpoint

/api/threads/{thread_id}/state powers the local State Inspector and mirrors the TUI's /debug state command. It returns raw runtime implementation state, including transcript, memory, safety, routing, and diagnostics fields. It is useful for development and dogfooding, but product clients should use typed endpoints such as /history, /session-status, /memory/*, and /chat/stream.

Text session finalization returns the same stable envelope shape as voice finalization:

{
  "finalized": true,
  "summary": "Session summary text",
  "detail": "Session finalized.",
  "themes": ["stress", "sleep"],
  "mood_opened": "tense",
  "mood_closed": "calmer",
  "turn_count": 4,
  "open_loops": [],
  "resolved_threads": []
}

When no durable summary is produced, finalized is false, summary is null, list fields are empty, and detail explains why.
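
A client can branch on finalized before touching the summary fields. The sketch below is one way to do that; the helper name `describe_finalization` and the sample detail string in the unfinalized case are hypothetical, and only the fields shown in the envelope above are read.

```python
def describe_finalization(envelope: dict) -> str:
    """Return a one-line, user-facing description of a finalization envelope."""
    if not envelope.get("finalized"):
        # No durable summary was produced: `detail` explains why.
        return f"Not saved: {envelope.get('detail', '')}"
    themes = ", ".join(envelope.get("themes", [])) or "none"
    turns = envelope.get("turn_count")
    return f"{envelope['summary']} (themes: {themes}; turns: {turns})"


finalized_example = {
    "finalized": True,
    "summary": "Session summary text",
    "detail": "Session finalized.",
    "themes": ["stress", "sleep"],
    "mood_opened": "tense",
    "mood_closed": "calmer",
    "turn_count": 4,
    "open_loops": [],
    "resolved_threads": [],
}

# Hypothetical unfinalized envelope; the detail text is illustrative only.
unfinalized_example = {
    "finalized": False,
    "summary": None,
    "detail": "No turns recorded.",
    "themes": [],
    "open_loops": [],
    "resolved_threads": [],
}

print(describe_finalization(finalized_example))
print(describe_finalization(unfinalized_example))
```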

Memory

| Route | Method | Purpose |
| --- | --- | --- |
| /api/memory/status | GET | Return memory counts, store totals, and recall state |
| /api/memory/recall | PATCH | Enable or disable proactive memory recall |
| /api/memory/facts | GET | List semantic facts |
| /api/memory/sessions | GET | List episodic session arcs |
| /api/memory/rules | GET | List procedural style rules |
| /api/memory/facts/{index} | DELETE | Delete one semantic fact by displayed index |
| /api/memory/sessions/{index} | DELETE | Delete one episodic arc by displayed index |
| /api/memory/rules/{index} | DELETE | Delete one procedural rule by displayed index |

Memory endpoints are scoped by thread_id, optional user_id, and optional memory_mode. In incognito mode, user-memory reads return empty counts and lists for semantic facts, episodic sessions, and procedural rules. Saved-memory mutation endpoints reject the request with HTTP 409 and a structured detail payload:

{
  "detail": {
    "code": "incognito_memory_mutation_unavailable",
    "message": "Saved-memory controls are unavailable in incognito mode."
  }
}

Audit-oriented counts such as crisis logs and session feedback may still be non-zero because those stores are always-on and privacy-scrubbed in incognito mode.
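
A client can recognize the incognito rejection by matching the structured code rather than the message text. The following is a sketch under that assumption; the helper name `incognito_rejection_message` is hypothetical, and it takes an already-decoded status code and JSON body so it stays independent of any HTTP library.

```python
from typing import Optional

INCOGNITO_CODE = "incognito_memory_mutation_unavailable"


def incognito_rejection_message(status_code: int, body: dict) -> Optional[str]:
    """Return the server's message if this is the incognito 409, else None."""
    if status_code != 409:
        return None
    detail = body.get("detail")
    # Only the structured form documented above is treated as a match.
    if isinstance(detail, dict) and detail.get("code") == INCOGNITO_CODE:
        return detail.get("message")
    return None


rejection = {
    "detail": {
        "code": "incognito_memory_mutation_unavailable",
        "message": "Saved-memory controls are unavailable in incognito mode.",
    }
}
print(incognito_rejection_message(409, rejection))
```

Returning None for any other response leaves generic error handling to the caller.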

Voice

| Route | Method | Purpose |
| --- | --- | --- |
| /api/voice/realtime/session | POST | Create an OpenAI Realtime client secret for browser WebRTC voice |
| /api/voice/realtime/tools | POST | Execute one app-owned Realtime function tool call |
| /api/voice/realtime/turn | POST | Persist a finalized voice user/assistant turn in app-owned history |
| /api/voice/realtime/end | POST | Finalize a persistent voice session through the runtime session finalizer |

Voice is OpenAI Realtime-native. The browser owns WebRTC audio, while the backend owns session configuration, memory bootstrap, function-tool execution, route/style inference during turn persistence, and end-session memory finalization.

Session creation request:

{
  "thread_id": "web-voice-abc123",
  "user_id": "alice",
  "memory_mode": "persistent",
  "assistant_voice": "marin"
}

assistant_voice is optional. When omitted (or null), the backend applies the default (alloy). The value is normalized (trimmed, lower-cased) and must be one of the ten supported Realtime voices: alloy, ash, ballad, cedar, coral, echo, marin, sage, shimmer, verse. An unsupported name is rejected.
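
The normalization rule above can be mirrored client-side to catch bad values before the request is sent. This is a sketch of the described behavior, not the backend's code: the function name `normalize_assistant_voice` is hypothetical, raising ValueError stands in for whatever rejection the backend returns, and treating an empty string as unsupported is an assumption.

```python
from typing import Optional

SUPPORTED_VOICES = frozenset(
    {"alloy", "ash", "ballad", "cedar", "coral",
     "echo", "marin", "sage", "shimmer", "verse"}
)
DEFAULT_VOICE = "alloy"


def normalize_assistant_voice(value: Optional[str]) -> str:
    """Apply the default, trim and lower-case, then validate the voice name."""
    if value is None:
        return DEFAULT_VOICE
    voice = value.strip().lower()
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"Unsupported assistant_voice: {value!r}")
    return voice


print(normalize_assistant_voice(None))
print(normalize_assistant_voice("  Marin "))
```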

Tool execution request:

{
  "thread_id": "web-voice-abc123",
  "user_id": "alice",
  "memory_mode": "persistent",
  "current_user_message": "Can you look up the official guidance?",
  "transcript": [
    {"role": "user", "content": "Can you look up the official guidance?"}
  ],
  "tool_name": "answer_grounded_lookup",
  "arguments": {"query": "official guidance"}
}

Turn recording request:

{
  "thread_id": "web-voice-abc123",
  "user_id": "alice",
  "memory_mode": "persistent",
  "user_text": "Can you look up the official guidance?",
  "assistant_text": "I found the official source...",
  "tool_calls": [
    {
      "tool_name": "answer_grounded_lookup",
      "status": "completed",
      "output": {"response_text": "I found the official source..."}
    }
  ]
}

The backend infers the recorded route and response_style from the tool calls that occurred during the Realtime turn.

End-session request:

{
  "thread_id": "web-voice-abc123",
  "memory_mode": "persistent",
  "feedback": "positive"
}

Both /api/threads/{thread_id}/end and /api/voice/realtime/end accept an optional feedback label (positive / negative / skip) that is written to the session-feedback store before summarization. Omit it (or send null) to skip the feedback step.
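
As an illustration of the optional feedback label, the sketch below builds a voice end-session body and leaves feedback out when none was given. The function name `build_voice_end_request` is hypothetical, the client-side check is an added convenience, and treating memory_mode as a required argument simply mirrors the example request above.

```python
from typing import Optional

FEEDBACK_LABELS = ("positive", "negative", "skip")


def build_voice_end_request(
    thread_id: str,
    memory_mode: str,
    feedback: Optional[str] = None,
) -> dict:
    """Build an end-session body; omit `feedback` to skip the feedback step."""
    if feedback is not None and feedback not in FEEDBACK_LABELS:
        raise ValueError(f"feedback must be one of {FEEDBACK_LABELS}, got {feedback!r}")
    body = {"thread_id": thread_id, "memory_mode": memory_mode}
    if feedback is not None:
        body["feedback"] = feedback
    return body


with_feedback = build_voice_end_request("web-voice-abc123", "persistent", "positive")
without_feedback = build_voice_end_request("web-voice-abc123", "persistent")
print(with_feedback)
print(without_feedback)
```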

Voice end-session responses use the same finalized, summary, detail, and session-arc envelope documented for text sessions.

Client contracts

The text chat response schema exposes the user-visible text plus routing metadata:

| Field | Meaning |
| --- | --- |
| response_text | Assistant message |
| response_type | Public category: therapeutic or crisis |
| response_style | More specific style or operational branch, such as supportive, memory_control, grounded_lookup, or crisis_response |
| therapeutic_approach | Therapeutic approach overlay when applicable |
| crisis | Normalized crisis assessment |
| session_action | UI hint: suggest_end_session when the assistant produced a closing reply, otherwise none |
| diagnostics | Per-turn timings and routing metadata |

The WebSocket stream emits status, chunk, done, and terminal error events:

{"type": "status", "stage": "loading memory", "detail": ""}
{"type": "chunk", "text": "That sounds heavy."}
{"type": "done", "response": {"response_text": "..."}}
{"type": "error", "code": "agent_turn_failed", "message": "The turn could not be completed."}

Frontend clients should treat an error event as terminal and display its message field to the user.

WebSocket clients receive the assistant's final text through chunk events (incremental) and the done payload (complete). The runtime's internal stream also produces a response_ready event, but the WebSocket handler does not forward it; the TUI consumes it to render the reply early. Integrators should not wait for a response_ready message over the socket.
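
The event contract above can be sketched as a small reducer over raw JSON frames. It is transport-agnostic (any WebSocket client could feed it frames) and is an illustration only: the names `consume_stream` and `StreamError`, the client-side `stream_closed` code, and the sample frames are assumptions, not part of the API.

```python
import json
from typing import Iterable


class StreamError(Exception):
    """Raised for a terminal error event or a stream that ends early."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message


def consume_stream(frames: Iterable[str]) -> dict:
    """Fold status/chunk/done/error frames into one result dict."""
    chunks = []
    stages = []
    for raw in frames:
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "status":
            stages.append(event.get("stage", ""))
        elif kind == "chunk":
            chunks.append(event.get("text", ""))
        elif kind == "done":
            # `done` carries the complete response; stop reading here.
            return {
                "text": "".join(chunks),
                "response": event.get("response"),
                "stages": stages,
            }
        elif kind == "error":
            # Terminal: surface `message` to the user.
            raise StreamError(event.get("code", ""), event.get("message", ""))
        # Other event types (for example response_ready) are never awaited.
    # Hypothetical client-side code for a socket that closed without `done`.
    raise StreamError("stream_closed", "The stream ended before a done event.")


# Illustrative frames shaped like the events documented above.
frames = [
    '{"type": "status", "stage": "loading memory", "detail": ""}',
    '{"type": "chunk", "text": "That sounds "}',
    '{"type": "chunk", "text": "heavy."}',
    '{"type": "done", "response": {"response_text": "That sounds heavy."}}',
]
result = consume_stream(frames)
print(result["text"])
```

A UI would typically render result["text"] incrementally as chunks arrive and then read routing metadata such as session_action from result["response"].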