Tools & Policy

Voice exposes a narrow OpenAI Realtime function-tool surface. The schemas live in agent/voice/tools.py, but the implementation reuses the same app-owned services as the text runtime: memory control, grounded lookup, crisis resources, therapeutic response skills, and guided exercises.

Tool surface

| Tool | Mode | Purpose |
| --- | --- | --- |
| wait_for_user | Incognito + persistent | Stay quiet for silence, background audio, side conversations, or speech not addressed to OpenCouch. |
| show_memory_status | Incognito + persistent | Explain whether memory is enabled and report saved-memory counts. |
| load_therapeutic_response_skill | Incognito + persistent | Load prompt-local therapeutic response-style guidance for ordinary replies. |
| answer_grounded_lookup | Incognito + persistent | Answer explicit current, official, source-backed, or externally verifiable requests from grounded results. |
| lookup_crisis_resources | Incognito + persistent | Find verified crisis resources for the current crisis turn. |
| get_crisis_support_template | Incognito + persistent | Load a deterministic crisis-response safety scaffold to shape the crisis reply. |
| list_guided_exercise_skills | Incognito + persistent | List metadata-only guided exercises available for the current channel and approach. |
| load_guided_exercise_skill | Incognito + persistent | Load the runtime-selected guided-exercise skill block for a step/action. |
| record_guided_exercise_progress | Incognito + persistent | Update active guided-exercise state from the user's latest response. |
| show_saved_memory | Persistent only | Show concise saved facts, session summaries, and procedural rules. |
| recall_saved_memory | Persistent only | Query the user's saved memory for facts and session arcs relevant to the current turn. |
| save_response_preference | Persistent only | Save an explicit response-style or memory-use preference. |
| set_proactive_memory_recall | Persistent only | Enable or disable proactive memory recall. |
| prepare_memory_deletion_by_index | Persistent only | Stage deletion of a visible saved-memory item by kind and one-based index. |
| prepare_memory_deletion_by_query | Persistent only | Stage deletion of a saved-memory item selected by a query. |
| confirm_memory_deletion | Persistent only | Confirm and perform a pending deletion. |
| cancel_memory_deletion | Persistent only | Cancel a pending deletion. |

The tool surface is filtered by memory mode: persistent sessions expose all 17 tools, while incognito sessions expose only the 9 mode-independent tools above. Calls to persistent-only tools are rejected in incognito voice mode. The one memory tool that stays available in incognito is show_memory_status, which returns an explicit incognito status so the assistant can explain that durable memory is off.
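The mode filter can be pictured as a simple allow-list check. The following is a minimal sketch, not the code in agent/voice/tools.py: the tool names come from the table above, while MemoryMode, the two grouping constants, tools_for_mode, and is_call_allowed are hypothetical names introduced for illustration.

```python
from enum import Enum


class MemoryMode(str, Enum):
    # Hypothetical enum; the real mode representation is not shown in this page.
    INCOGNITO = "incognito"
    PERSISTENT = "persistent"


# Tool names copied from the table; the grouping constants are assumptions.
MODE_INDEPENDENT_TOOLS = (
    "wait_for_user",
    "show_memory_status",
    "load_therapeutic_response_skill",
    "answer_grounded_lookup",
    "lookup_crisis_resources",
    "get_crisis_support_template",
    "list_guided_exercise_skills",
    "load_guided_exercise_skill",
    "record_guided_exercise_progress",
)
PERSISTENT_ONLY_TOOLS = (
    "show_saved_memory",
    "recall_saved_memory",
    "save_response_preference",
    "set_proactive_memory_recall",
    "prepare_memory_deletion_by_index",
    "prepare_memory_deletion_by_query",
    "confirm_memory_deletion",
    "cancel_memory_deletion",
)


def tools_for_mode(mode: MemoryMode) -> tuple:
    """Return the tool names exposed to the Realtime session for this mode."""
    if mode is MemoryMode.PERSISTENT:
        return MODE_INDEPENDENT_TOOLS + PERSISTENT_ONLY_TOOLS
    return MODE_INDEPENDENT_TOOLS


def is_call_allowed(tool_name: str, mode: MemoryMode) -> bool:
    """Reject calls to tools that are not exposed in the current mode."""
    return tool_name in tools_for_mode(mode)
```

Checking the mode again at dispatch time, and not only when the session is configured, means a persistent-only call is still rejected even if the model names a tool it was never offered.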

Intent gate on memory mutations

Beyond mode filtering, the four destructive or preference-changing memory tools (set_proactive_memory_recall, save_response_preference, prepare_memory_deletion_by_index, and prepare_memory_deletion_by_query) carry an additional intent gate. Each call must include a user_quote argument, and the backend lets the mutation proceed only if that quote meets a minimum length and matches something the user actually said within the last few turns. If the model cannot cite a matching user utterance, the call is blocked.

This is an anti-misfire guard for a hands-free surface: in a free-flowing speech loop there is no explicit confirm button, so the gate prevents the model from saving a preference or staging a deletion the user never requested.
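One way such a gate could work is a normalized substring check against recent user transcripts. This is a sketch under stated assumptions, not the backend's actual matcher: the window size, the minimum length, the normalization, and the function names are all hypothetical, since the source only says "the last few turns" and "a minimum length".

```python
import re

# Assumed values for illustration only; the real thresholds are not documented here.
MIN_QUOTE_CHARS = 8
RECENT_TURN_WINDOW = 3


def _normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def quote_matches_recent_speech(user_quote: str, user_turns: list) -> bool:
    """Return True only if the quote is long enough and appears in a recent user turn.

    user_turns is the list of user transcripts, oldest first.
    """
    quote = _normalize(user_quote)
    if len(quote) < MIN_QUOTE_CHARS:
        return False  # too short to count as evidence of intent
    recent = user_turns[-RECENT_TURN_WINDOW:]
    return any(quote in _normalize(turn) for turn in recent)
```

For example, with the turns `["I had a rough day.", "Please stop bringing up old memories on your own."]`, the quote "stop bringing up old memories" passes, while a quote the user never said is blocked.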

Realtime-owned turns

Realtime server VAD owns turn detection and response creation for live voice sessions. The session configuration enables automatic response creation after finalized user speech, so the browser does not call a backend policy endpoint before each response.
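A session configuration along these lines might look like the sketch below. It follows the shape of the Realtime `session.update` event with `turn_detection` placed directly under `session`; newer API versions nest turn detection under the audio input settings, so check the API reference for the version in use. The build_session_update helper is a hypothetical name, and the actual OpenCouch configuration may set additional fields.

```python
import json


def build_session_update(tools: list) -> dict:
    """Build a session.update event that lets server VAD own turns and responses."""
    return {
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                # Realtime creates a response automatically once user speech is finalized.
                "create_response": True,
            },
            "tools": tools,
            "tool_choice": "auto",
        },
    }


# Example with a single function tool from the tool surface.
wait_tool = {
    "type": "function",
    "name": "wait_for_user",
    "description": "Stay quiet for silence, background audio, or speech not addressed to the assistant.",
    "parameters": {"type": "object", "properties": {}},
}
event = build_session_update([wait_tool])
payload = json.dumps(event)  # what would be sent over the Realtime connection
```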

The backend still owns the app-specific boundaries:

  • Realtime function tools call POST /api/voice/realtime/tools.
  • Tools that require app data, safety resources, memory state, grounded lookup, or guided-exercise state execute through shared backend services.
  • wait_for_user is a no-op signal for silence, background audio, side conversations, or speech not addressed to OpenCouch. Its function output is returned to Realtime without asking the model to continue.
  • When the browser records a completed exchange through POST /api/voice/realtime/turn, the backend infers route and response-style metadata from the actual tool calls that happened in the Realtime session.
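The wait_for_user behavior can be sketched as follows. The event types `conversation.item.create` (with a `function_call_output` item) and `response.create` are Realtime client events; the events_for_tool_result helper is hypothetical, and it is written in Python only for consistency with the backend, even though the real client that sends these events may be browser code.

```python
import json


def events_for_tool_result(tool_name: str, call_id: str, output: dict) -> list:
    """Return the Realtime client events to send after a tool call finishes."""
    events = [
        {
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": call_id,
                "output": json.dumps(output),
            },
        }
    ]
    # For every tool except wait_for_user, ask the model to continue speaking.
    # wait_for_user is a no-op signal, so the output is returned and nothing else is sent.
    if tool_name != "wait_for_user":
        events.append({"type": "response.create"})
    return events
```

Returning the function output even for the no-op keeps the conversation history consistent, because every function call in the session has a matching output item.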

Crisis and factual lookups

Two cases should be treated as hard boundaries:

  • Crisis turns that need specific resources must use lookup_crisis_resources; the assistant must not invent hotline names, phone numbers, URLs, or local availability.
  • Factual/current/source-backed requests must use answer_grounded_lookup; the assistant should answer only from the tool result.

Provider failures surface as tool failures rather than silent fallback answers. Verified-but-empty lookup cases return explicit status values so the user gets "I could not verify that" instead of invented content.
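To illustrate the difference between a provider failure and a verified-but-empty result, here is a minimal sketch. The status names, the LookupResult shape, and run_grounded_lookup are assumptions made for this example; the actual status values returned by answer_grounded_lookup are not listed in this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class LookupStatus(str, Enum):
    # Hypothetical status values.
    OK = "ok"
    NO_VERIFIED_RESULTS = "no_verified_results"
    PROVIDER_ERROR = "provider_error"


@dataclass
class LookupResult:
    status: LookupStatus
    answer: Optional[str] = None
    sources: List[str] = field(default_factory=list)


def run_grounded_lookup(query: str, provider: Callable[[str], list]) -> LookupResult:
    """Call the provider and map every outcome to an explicit status."""
    try:
        hits = provider(query)
    except Exception:
        # Surface the failure as a tool failure; never fall back to a guessed answer.
        return LookupResult(status=LookupStatus.PROVIDER_ERROR)
    if not hits:
        # The lookup ran but nothing could be verified.
        return LookupResult(status=LookupStatus.NO_VERIFIED_RESULTS)
    return LookupResult(
        status=LookupStatus.OK,
        answer=hits[0]["snippet"],
        sources=[hit["url"] for hit in hits],
    )
```

With explicit statuses, the model can say "I could not verify that" for the two non-OK cases and answer from the sources only when the status is OK.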

Guided exercises

Voice uses the shared guided-exercise catalog but filters to exercises that are suitable for spoken delivery. There are 13 total exercises and 10 voice-supported exercises today:

  • grounding_5_4_3_2_1
  • grounding_box_breathing
  • grounding_stop_technique
  • grounding_muscle_relaxation
  • thought_work_continuum
  • behavioral_activation_tiny_action
  • defusion_values_compass
  • self_compassion_break
  • emotion_regulation_improve
  • emotion_regulation_gratitude

The voice prompt explicitly tells the model not to default to 5-4-3-2-1 grounding unless that exact runtime-selected skill has been provided.
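Channel filtering over a shared catalog can be sketched as below. This is an illustration only and does not reflect the actual structure of agent/skills/guided_exercises/registry.py: ExerciseMeta, its channels field, and exercises_for_channel are hypothetical, and the text-only entry is a labeled placeholder because the non-voice exercises are not named in this page.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ExerciseMeta:
    # Hypothetical metadata record for one catalog entry.
    exercise_id: str
    channels: FrozenSet[str]


# A tiny stand-in catalog, not the real 13-entry one.
CATALOG = (
    ExerciseMeta("grounding_box_breathing", frozenset({"text", "voice"})),
    ExerciseMeta("self_compassion_break", frozenset({"text", "voice"})),
    # Placeholder id invented for this sketch; not a real exercise.
    ExerciseMeta("example_text_only_exercise", frozenset({"text"})),
)


def exercises_for_channel(channel: str) -> List[str]:
    """Return the ids of exercises that support the given delivery channel."""
    return [entry.exercise_id for entry in CATALOG if channel in entry.channels]
```

Keeping a single catalog with per-entry channel metadata lets text and voice share exercise definitions while each surface lists only what it can deliver.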

Key files

| File | Purpose |
| --- | --- |
| agent/voice/tools.py | Realtime function-tool schemas and dispatcher. |
| agent/voice/turn_metadata.py | Route and response-style metadata inference from recorded Realtime tool calls. |
| agent/voice/policy.py | Compact Realtime instruction policy for one voice session. |
| agent/tools/ | Shared tool implementations used by both text and voice. |
| agent/skills/guided_exercises/registry.py | Shared exercise catalog and voice filtering. |