Hybrid Retrieval
Reciprocal Rank Fusion (RRF) combines two scorers so the agent handles both exact-name queries and paraphrased queries well.
|query ∩ record| / |query|. Keep ≥ 0.33.score = Σ 1/(k + rank), k=60Why hybrid?
Neither scorer alone is robust enough for therapy content:
| Scorer | Wins on | Loses on |
|---|---|---|
| Token-recall | Proper nouns ("Sarah"), medication names ("fluoxetine"), short queries, multilingual text (CJK, Cyrillic, accented Latin) | Stemming ("anxiety" vs "anxious"), synonyms ("sibling" vs "sister"), paraphrase |
| Embedding | Stemming, synonyms, semantic paraphrase ("I feel stuck" vs "things feel hopeless") | Proper nouns (name signal diluted), short queries (too little context) |
| Hybrid RRF | Both — fuses by rank position, not raw score | Nothing significant |
The tokenizer uses \b\w+\b (Python 3 Unicode-aware) with a CJK
character-splitting post-processor. Chinese, Japanese, Korean, Cyrillic,
and accented Latin text all produce meaningful token sets for both dedup
and retrieval. CJK characters are emitted individually (standard for CJK
IR without a word segmenter), covering BMP and astral-plane Extensions
B through H.
RRF's constant k=60 requires no per-dataset tuning (Cormack et al.
2009).
Fallback paths
| Scenario | What happens |
|---|---|
| No embedding provider | Pure token-recall (the pre-embedding behavior) |
| Embedding API failure | Logged, falls back to token-recall for this turn |
| Record has no embedding | Participates in token-recall only |
| Model mismatch | Record skipped in embedding scan |
The retrieval_path diagnostic reports which path ran:
"hybrid_rrf", "token_recall", or "token_recall_after_embed_error".
Episodic date filter
Query-based episodic retrieval applies a max_age_days=30 filter —
session arcs older than 30 days are excluded from the search results.
This keeps the agent focused on recent context.
The first-turn catch-up path (alatest) is not date-filtered — the
most recent session summary always appears regardless of age. This
ensures every new session opens with continuity even if the user hasn't
visited in months.
Embedding storage
| Column | Type | Purpose |
|---|---|---|
embedding | BLOB | float32 array via struct.pack |
embedding_dim | INTEGER | Dimensionality validation |
embedding_model | TEXT | Model migration detection |
Default provider: Gemini gemini-embedding-001 (3072 dims). Falls back
to NullEmbeddingProvider when no API key is set.
Eval harness
# Token-recall baseline (no API key needed)
uv run python eval/runners/retrieval_eval.py --mode token-only
# Compare all three scorers (requires API key)
uv run python eval/runners/retrieval_eval.py --mode hybrid --verbose
17 hand-curated cases across 6 categories. Output is a scorer comparison matrix showing recall@1 / recall@5 per category.