Features
A practical guide to what Krill does and how to use each part.
Agent Runtime
Everything starts with RuntimeState. It wires together the channel, provider, tools, memory, and all other subsystems into a single running process.
Pass everything directly as llm_* keyword arguments (flat API):

```julia
rt = RuntimeState(channel;
    llm_provider = OpenAIProvider(api_key=ENV["OPENAI_API_KEY"], model="gpt-5.4"),
    system_prompt = "You are a helpful assistant.",
    workspace = "context",
    data_dir = joinpath(homedir(), ".krill"),
)
start!(rt)
wait()
shutdown!(rt)
```

Or compose an Agent struct first and pass it directly (Agent API):
```julia
agent = Agent(OpenAIProvider(api_key=ENV["OPENAI_API_KEY"], model="gpt-5.4");
    system_prompt = "You are a helpful assistant.",
    workspace = "context",
    hooks = AgentHooks(
        on_tool_call = (name, args) -> @info("Tool called", tool=name),
        on_tool_result = (name, res) -> @info("Tool result", tool=name, chars=length(res)),
    ),
    retry = RetryConfig(max_retries=5, base_delay_s=1.0),
)
rt = RuntimeState(channel, agent)
start!(rt)
wait()
shutdown!(rt)
```

Both paths are equivalent. Agent is more convenient when you want to share, inspect, or test the config separately from channel setup.
| Parameter | What it does |
|---|---|
| channel | Where messages come from and go to (Telegram, Discord) |
| llm_provider / provider | Which LLM to use — OpenAIProvider or GeminiProvider |
| system_prompt | Base instructions prepended to every conversation |
| workspace | Agent file sandbox — bootstrap docs and skills live here |
| data_dir | Krill state directory — sessions, memory, cron (~/.krill by default) |
See Architecture for the full component picture.
Hooks
AgentHooks lets you inject callbacks at five points in the agent loop without modifying any core code. All hooks are optional; failures are caught, logged with @warn, and never propagate.
```julia
hooks = AgentHooks(
    on_turn_start = (msg, history) -> nothing,      # before each LLM call
    on_turn_end = (msg, history) -> nothing,        # after response is stored
    on_tool_call = (name, args) -> nothing,         # before each tool dispatch
    on_tool_result = (name, result) -> nothing,     # after successful dispatch
    should_interrupt = (name, args) -> false,       # return true to stop the loop
)
```

| Hook | Signature | When it fires |
|---|---|---|
| on_turn_start | (msg, history) -> nothing | Before the LLM call, after typing indicator |
| on_turn_end | (msg, history) -> nothing | After response stored, before next turn |
| on_tool_call | (tool_name, arguments) -> nothing | Before each tool is dispatched |
| on_tool_result | (tool_name, result_text) -> nothing | After a successful tool dispatch |
| should_interrupt | (tool_name, arguments) -> Bool | Before each tool; true stops the loop |
When should_interrupt returns true, the current tool and any remaining tools in the same LLM batch are skipped and the loop exits with whatever response text was accumulated so far.
Pass via Agent(provider; hooks=...) or llm_hooks=... on RuntimeState.
Retry
RetryConfig controls how Krill retries failed LLM API calls. By default Krill retries up to 3 times with exponential backoff; RetryConfig lets you tune or replace that policy.
```julia
retry = RetryConfig(
    max_retries = 5,
    base_delay_s = 1.0,
    max_delay_s = 60.0,
    multiplier = 2.0,
    jitter = true,  # randomise 0.75–1.0× to avoid thundering herd
    retriable_status_codes = Set([408, 429, 500, 502, 503, 504, 529]),
)
```

Sleep formula: `clamp(base_delay_s × multiplier^(attempt-1) × jitter_factor, 0, max_delay_s)`
Pass via Agent(provider; retry=...) or llm_retry=... on RuntimeState. When omitted, Krill falls back to the per-provider max_retries / retry_base_seconds fields.
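The sleep formula can be illustrated in plain Julia. This is a standalone sketch of the schedule described above, not Krill's internal code; the function name is hypothetical:

```julia
# Sketch of the backoff schedule: the delay grows geometrically per attempt,
# is optionally jittered into the 0.75–1.0× range, then clamped to max_delay_s.
function backoff_delay(attempt::Int; base_delay_s=1.0, multiplier=2.0,
                       max_delay_s=60.0, jitter=true)
    jitter_factor = jitter ? 0.75 + 0.25 * rand() : 1.0
    clamp(base_delay_s * multiplier^(attempt - 1) * jitter_factor, 0, max_delay_s)
end

# Without jitter the schedule is deterministic: 1, 2, 4, 8, 16, 32, then clamped.
delays = [backoff_delay(a; jitter=false) for a in 1:7]
```

With the defaults above, attempt 7 would sleep 64 s before clamping; max_delay_s caps it at 60 s.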
Tools
Tools are what the LLM can actually do. Krill has two classes.
Local built-ins
Enabled with llm_enable_builtin_tools=true. Always available regardless of provider.
| Tool | What it does |
|---|---|
| read_file, write_file, edit_file, list_dir | File operations inside workspace |
| web_fetch | Fetch a specific URL as markdown |
| github | Wraps the gh CLI |
| message | Send a message to a chat ID (only registered when a send function is available) |
| exec | Shell commands — opt-in via llm_builtin_enable_exec=true |
| google_workspace | Wraps the gws CLI for Gmail, Calendar, Drive — opt-in via llm_enable_google_workspace=true |
| claude_code | Delegate a task to Claude Code CLI — opt-in via llm_enable_claude_code=true |
| codex | Delegate a task to Codex CLI — opt-in via llm_enable_codex=true |
By default, file tools are restricted to workspace. Set llm_builtin_restrict_to_workspace=false to allow access outside it.
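The restriction amounts to a path-confinement check. A minimal sketch of the idea (the helper name is hypothetical, not Krill's actual implementation):

```julia
# Sketch: a file tool should reject paths that resolve outside the workspace.
# Hypothetical helper illustrating the restrict-to-workspace behaviour.
function inside_workspace(workspace::AbstractString, path::AbstractString)
    root = abspath(workspace)
    target = abspath(joinpath(root, path))   # an absolute `path` ignores `root`
    target == root || startswith(target, rstrip(root, '/') * "/")
end
```

Traversal via `..` and absolute paths both normalise to targets outside the root and are rejected.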
Provider built-ins
Pass provider-native tools via llm_tools. These run on the provider's infrastructure and are significantly more capable than local equivalents for web search and code execution.
```julia
# OpenAI
llm_tools = [
    Dict("type" => "web_search"),
    Dict("type" => "code_interpreter", "container" => Dict("type" => "auto")),
]

# Gemini
llm_tools = [
    Dict("googleSearch" => Dict{String,Any}()),
    Dict("urlContext" => Dict{String,Any}()),
    Dict("codeExecution" => Dict{String,Any}()),
]
```

For most bots, enable both: provider tools for web search quality, local tools for file access and GitHub. Use claude_code or codex for multi-step research or coding tasks.
Provider tools not yet enabled
Both providers offer additional built-in tools that Krill doesn't enable by default. I may come back to these later, but most are not high-value items for a personal assistant use case right now.
| Tool | Provider | Value | Notes |
|---|---|---|---|
| image_generation | OpenAI | High | "Generate an image of X" is a natural request. Easy to enable on the API side, but Krill's pipeline currently assumes text-only responses — displaying images in Telegram/Discord requires changes to parsing, message types, and channel senders. |
| googleMaps | Gemini | Medium | Useful for location queries ("restaurants near me", "directions to X"). Has compatibility constraints — can't combine with googleSearch, codeExecution, or urlContext in the same request. The agent loses other tools when maps is active. _sanitize_gemini_tools in parsing.jl already handles this. |
| file_search | Both | Low | Requires pre-uploading documents to vector stores (OpenAI) or file search stores (Gemini). Not useful unless you set up a knowledge base. Could be valuable later for searching over a document library. |
| computerUse | Gemini | None | Krill runs headless — there's no screen to interact with. |
| mcp (remote) | OpenAI | None | OpenAI's server-side MCP. Krill already has local MCP support which is more flexible. |
MCP Servers
MCP lets you connect any external tool server — databases, calendars, custom APIs — and expose its tools to the LLM alongside the built-ins.
Configure servers in krill.toml:
```toml
[[profile.mcp]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "context"]

[[profile.mcp]]
name = "huggingface"
transport = "streamable_http"
url = "https://huggingface.co/mcp"

[profile.mcp.headers]
Authorization = "Bearer $HF_TOKEN"
```

At startup, Krill calls initialize and list_tools on each server and registers the results into the local ToolRegistry under namespaced IDs: mcp_<name>_<tool>.
The Hugging Face MCP server provides semantic search across ML papers, models, datasets, Spaces, and HF documentation — no local Node.js process needed since HF hosts the server remotely.
| Transport | When to use |
|---|---|
| stdio | Local process (npx, uvx, etc.) — most common |
| streamable_http | Remote HTTP MCP server |
| sse | Legacy HTTP+SSE — use only if the server requires it |
Note: Julia has no official MCP SDK. Krill's client implements JSON-RPC initialize / list / call from scratch. It handles the common cases but may have edge-case issues with non-standard servers. See Known Limitations.
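For reference, the handshake amounts to three JSON-RPC messages. A sketch of their shape (field names follow the MCP specification, but this is illustrative, not Krill's client code):

```julia
# Sketch: the JSON-RPC requests an MCP client sends at startup and per tool call.
jsonrpc_request(id, method, params) = Dict(
    "jsonrpc" => "2.0", "id" => id, "method" => method, "params" => params,
)

init = jsonrpc_request(1, "initialize", Dict(
    "protocolVersion" => "2024-11-05",
    "capabilities" => Dict(),
    "clientInfo" => Dict("name" => "krill", "version" => "0.1"),
))
list = jsonrpc_request(2, "tools/list", Dict())
call = jsonrpc_request(3, "tools/call",
    Dict("name" => "read_file", "arguments" => Dict("path" => "notes.md")))
```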
Skills
Skills are markdown documents that inject reusable instructions into the prompt. They live in context/skills/<name>/SKILL.md and are discovered automatically at startup.
```
context/
└── skills/
    ├── github/
    │   └── SKILL.md
    └── cron/
        └── SKILL.md
```

Each SKILL.md has a YAML frontmatter header:
```markdown
---
name: github
description: "Interact with GitHub using the github tool (gh CLI wrapper)."
requires_bins: gh
---
# GitHub
Use the `github` tool to run `gh` CLI commands...
```

| Field | Meaning |
|---|---|
| name | Skill identifier — must match the directory name |
| description | Shown to the LLM in the skills summary; primary trigger for loading the skill |
| always | true — full body injected into every turn; false — loaded on-demand via read_skill tool |
| requires_bins | Comma-separated binaries; skill marked unavailable if any are missing from PATH |
| requires_env | Comma-separated env vars; skill marked unavailable if any are unset |
Enable with llm_enable_builtin_skills=true.
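Since the frontmatter is flat `key: value` pairs, parsing it needs no YAML library. A minimal sketch (hypothetical helper, not Krill's discovery code):

```julia
# Sketch: extract frontmatter from a SKILL.md string.
# Handles only flat `key: value` pairs, which covers the fields above.
function parse_frontmatter(text::AbstractString)
    m = match(r"\A---\n(.*?)\n---"s, text)
    m === nothing && return Dict{String,String}()
    fields = Dict{String,String}()
    for line in split(m.captures[1], '\n')
        kv = match(r"^(\w+):\s*(.*)$", line)
        kv === nothing && continue
        fields[kv.captures[1]] = strip(kv.captures[2], ['"', ' '])
    end
    fields
end

doc = """
---
name: github
description: "Interact with GitHub."
requires_bins: gh
---
# GitHub
"""
meta = parse_frontmatter(doc)
```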
Built-in Skills
Krill ships with the following skills out of the box:
| Skill | Always-on | Description |
|---|---|---|
| memory | Yes | Two-layer memory system guidance — when and how to update MEMORY.md and search HISTORY.md |
| github | No | Patterns for the github tool: issues, PRs, CI status, repo info, API queries, --json/--jq output |
| cron | No | Scheduling guide for cron_add/cron_list/cron_remove with schedule type reference |
| weather | No | Free weather APIs (wttr.in, Open-Meteo) with no API key required |
| google-workspace | No | Gmail, Calendar, Drive patterns for the google_workspace tool — send, triage, reply, API commands |
| skill-creator | No | Guide to creating new skills — anatomy, frontmatter, bundled resources, design principles |
Always-on skills (memory) are injected into every system prompt. The rest are loaded on-demand when the LLM calls read_skill("github") etc.
ClawHub Integration
The agent can search and install skills from ClawHub, a public registry with 3,200+ community skills. Enable with clawhub = true in [profile.tools].
Unlike a raw npx clawhub install (which drops files directly into the workspace with no checks), Krill's built-in ClawHub tools route every skill through a quarantine → validation → verified store pipeline:
```
ClawHub API (untrusted source)
        ↓
clawhub_install tool
        ↓
Download ZIP → ~/.krill/skill_store/quarantine/{slug}/
        ↓
Validation gate
 ├─ Content scan (run(), ENV[], @eval, ccall, shell blocks, ...)
 ├─ Metadata check (SKILL.md exists, has description)
 ├─ Popularity thresholds (configurable min downloads/stars)
 └─ Allow/blocklist (by slug or author)
        ↓
Pass → ~/.krill/skill_store/verified/{slug}/ → available via read_skill
Fail → rejected, quarantine cleaned up, failure reasons reported
```

Four tools are registered when ClawHub is enabled:
| Tool | What it does |
|---|---|
| clawhub_search | Vector similarity search across the registry |
| clawhub_install | Download → quarantine → validate → promote or reject |
| clawhub_remove | Remove a skill from the verified store |
| clawhub_list | List installed skills with status, version, author |
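The content-scan step of the validation gate can be sketched as a pattern check over the skill text. The pattern list here is illustrative, not Krill's actual scanner:

```julia
# Sketch: flag skill content containing patterns associated with code execution.
const SUSPICIOUS = [
    r"\brun\(",          # shelling out
    r"ENV\[",            # environment access
    r"@eval",            # runtime eval
    r"\bccall\b",        # native calls
    Regex("`"^3),        # fenced code blocks
]

scan_skill(text::AbstractString) = [p.pattern for p in SUSPICIOUS if occursin(p, text)]
```

A real gate would report the matched patterns as the rejection reasons.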
Example conversation flow:
```
User: "Find me a skill for PDF processing"
→ LLM calls clawhub_search("PDF processing")
→ Returns 5 matching skills with metadata
→ LLM calls clawhub_install(slug="pdf-toolkit-pro")
→ Downloaded, quarantined, validated...
→ ✅ Promoted to verified store
→ Skill immediately available via read_skill("pdf-toolkit-pro")
```

Verified skills are discovered as a third source alongside workspace and builtin skills, with lowest precedence (workspace > builtin > clawhub). A workspace skill with the same name always wins.
ClawHub skills are also subject to prompt injection hardening beyond the validation gate: their descriptions are not injected into the system prompt (replaced by a static (third-party, on-demand) marker), always: true in their frontmatter is ignored, and content returned via read_skill is wrapped in an explicit untrusted-content frame. See Security for details.
Community skills use the same SKILL.md format. Some may reference tool names from other frameworks — minor edits may be needed to map to Krill's tool names.
No API key is required for searching and installing public skills. See Configuration for the full [clawhub] config section and Security for details on the validation gate.
Creating Custom Skills
Use the skill-creator skill for guidance on writing your own:
```
context/skills/my-skill/
├── SKILL.md       # Required — frontmatter + instructions
└── references/    # Optional — detailed docs loaded on-demand
    └── api.md
```

Key design principles:

- Concise — the context window is shared; only include what the LLM doesn't already know
- Progressive disclosure — metadata always loaded, body on-demand, resources only when needed
- Specific triggers — put "when to use" info in the description field, not the body
Prompt Context
On every turn, Krill assembles the system prompt from multiple sources in order:
1. system_prompt from RuntimeState
2. Bootstrap docs from workspace — SOUL.md, AGENTS.md, USER.md, TOOLS.md
3. Skill metadata summary (names + descriptions of available skills)
4. Always-on skill bodies (always: true)
5. Session memory from MEMORY.md
6. Tool-output safety notice
7. Runtime metadata — channel name, session key, chat ID, user ID, UTC timestamp
This is assembled fresh every turn, so changes to bootstrap docs or skills take effect immediately without restarting.
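Because assembly is just ordered concatenation of optional parts, it can be sketched in a few lines (hypothetical names, not Krill's actual assembler):

```julia
# Sketch: join non-empty prompt sections in a fixed order, skipping absent ones.
function assemble_prompt(parts::Vector{Pair{String,String}})
    join(("## $name\n$body" for (name, body) in parts if !isempty(body)), "\n\n")
end

prompt = assemble_prompt([
    "Base"           => "You are a helpful assistant.",
    "Bootstrap"      => "",                              # missing doc: skipped
    "Session Memory" => "User prefers metric units.",
])
```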
Slash Commands
Users can send slash commands directly in the chat. These are handled by the session consumer before reaching the LLM.
| Command | Description |
|---|---|
| /help | Show available commands and current session info (turn count, age) |
| /new | Clear the session history and start fresh |
| /stop | Interrupt the currently running LLM task |
| /remember <fact> | Save a fact to the user's global memory profile (see Global Memory below) |
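The pre-LLM interception can be sketched as a prefix dispatch (a hypothetical helper; only the command names come from the table above):

```julia
# Sketch: route slash commands before the message reaches the LLM.
function handle_command(text::AbstractString)
    startswith(text, '/') || return nothing      # plain message → goes to the LLM
    cmd, rest = split(text * " ", ' '; limit=2)
    cmd == "/help"     && return :show_help
    cmd == "/new"      && return :clear_session
    cmd == "/stop"     && return :interrupt
    cmd == "/remember" && return (:remember, String(strip(rest)))
    :unknown
end
```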
Memory
Krill has a two-layer memory system: session memory (per-chat, automatic) and global memory (per-user, explicit).
Session Memory
Per-session durable memory that persists across restarts. Scoped to a session key (e.g. telegram:123), so each chat has its own isolated memory store.
Enable with llm_enable_memory=true and llm_enable_memory_consolidation=true.
How it works:
1. After each turn, a consolidation process scans new history entries
2. It calls the LLM to extract durable facts and merge them into MEMORY.md
3. Processed history is archived to HISTORY.md so MEMORY.md stays compact
4. On the next turn, MEMORY.md is injected into the system prompt as ## Session Memory
Files written under ~/.krill/memory/<session>/:
| File | Purpose |
|---|---|
| MEMORY.md | Live consolidated memory — injected each turn |
| HISTORY.md | Archived consolidation batches |
| state.json | Offsets and failure tracking |
Global Memory
Cross-channel user profile that persists across sessions and channels. Keyed by user_id (not session key), so the same user is recognised whether they message from Telegram, Discord, or any other channel.
Enable by passing llm_global_memory_store to RuntimeState.
How it works:
1. Users write facts explicitly with /remember <fact>
2. The LLM merges the new fact into the existing profile — deduplicating, resolving contradictions, and reorganising into coherent sections
3. The updated profile is saved to MEMORY.md for that user
4. On every turn, the profile is injected into the system prompt as ## User Profile, above session memory
Files written under ~/.krill/global_memory/<user_id>/:
| File | Purpose |
|---|---|
| MEMORY.md | Live user profile — injected every turn across all sessions |
Prompt injection order (when both layers are enabled):
```
[Base system prompt]
[Bootstrap docs]
[Skills]
## User Profile      ← global memory (cross-channel)
## Session Memory    ← session memory (per-chat)
[Tool safety notice]
[Runtime metadata]
```

Cron
Schedule jobs that fire as synthetic messages back into the runtime — the LLM handles them like any other inbound message.
Enable with llm_enable_cron=true. The LLM can then use cron_add, cron_list, and cron_remove tools.
Three schedule types:
| Type | Example | Meaning |
|---|---|---|
| at | at 2026-04-01T09:00:00 | One-shot at a specific time |
| every | every 30m | Repeating interval |
| cron | 0 9 * * 1-5 | Standard 5-field cron expression |
Jobs persist to ~/.krill/cron/jobs.json and survive restarts.
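The at and every forms can be illustrated with a tiny parser using the Dates stdlib (hypothetical code, not Krill's scheduler; cron expressions are omitted since they need a full 5-field parser):

```julia
using Dates

# Sketch: parse the `at <ISO timestamp>` and `every <N><unit>` schedule forms.
function parse_schedule(spec::AbstractString)
    if (m = match(r"^at (.+)$", spec)) !== nothing
        return (:at, DateTime(m.captures[1]))
    elseif (m = match(r"^every (\d+)([smh])$", spec)) !== nothing
        unit = Dict("s" => Second, "m" => Minute, "h" => Hour)[m.captures[2]]
        return (:every, unit(parse(Int, m.captures[1])))
    end
    error("unrecognised schedule: $spec")
end
```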
Subagents
A subagent is an isolated background LLM task spawned mid-conversation. The parent session continues; when the subagent finishes, its result is injected back as a message.
Enable with llm_enable_subagents=true.
```
User: "Research X and summarise it"
→ LLM calls spawn_subagent(task="Research X")
→ Subagent runs in its own session with its own tool loop
→ Result injected into parent session when done
→ LLM continues with the result
```

Key behaviours:

- Subagents get their own isolated session — history doesn't bleed across
- Spawn tools are omitted inside subagents to prevent recursive spawning
- Concurrent subagent limit is enforced to avoid runaway parallelism
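A concurrency cap like this is commonly enforced with a counting semaphore; a sketch of the pattern using Base.Semaphore (not Krill's scheduler):

```julia
# Sketch: cap how many subagent tasks run at once with a counting semaphore.
const MAX_SUBAGENTS = 3
const slots = Base.Semaphore(MAX_SUBAGENTS)

function spawn_limited(task_fn)
    Threads.@spawn begin
        Base.acquire(slots)            # blocks while MAX_SUBAGENTS are running
        try
            task_fn()
        finally
            Base.release(slots)        # free the slot even if the task errors
        end
    end
end

results = Channel{Int}(8)
tasks = [spawn_limited(() -> put!(results, i)) for i in 1:5]
foreach(wait, tasks)
```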
Channels
Telegram
```julia
channel = TelegramChannel(TelegramClient(token); allow_from=["123456789"])
```

- Long polling via TelegramChannel
- HTTP webhook via TelegramWebhookChannel
- Normalises text, media, callback queries, and channel posts into InboundMessage
- allow_from controls which user IDs can interact (["*"] to allow all)
Discord
```julia
channel = DiscordChannel(DiscordClient(token); allow_from=["*"])
```

- Gateway client for MESSAGE_CREATE events
- REST client for sending and typing indicators
- Long messages split automatically on outbound
- Markdown formatter converts tables → code blocks, headings → bold, HRs → Unicode lines
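The heading and horizontal-rule conversions can be sketched line-by-line (a simplified illustration; the table-to-code-block step is omitted, and this is not Krill's formatter code):

```julia
# Sketch: rewrite markdown headings as bold and rules as Unicode lines,
# in the spirit of the Discord formatter described above.
function discordify(md::AbstractString)
    converted = map(split(md, '\n')) do line
        if (m = match(r"^#{1,6}\s+(.*)$", line)) !== nothing
            "**" * m.captures[1] * "**"    # heading → bold
        elseif occursin(r"^-{3,}\s*$", line)
            "─"^20                          # horizontal rule → Unicode line
        else
            String(line)
        end
    end
    join(converted, '\n')
end
```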
Known Limitations
MCP client
Krill's MCP client is built from scratch with no official Julia SDK. Known edge cases:
- Partial SSE frames — servers that don't strictly follow the SSE spec may cause parse errors
- Stdio mid-call exit — reconnect fires but the in-flight result is lost
- Non-standard error shapes — the client may surface a generic error instead of the server's structured one
- Session continuity — HTTP re-initialization after reconnect may break servers that tie context to session state
Open an issue with the raw JSON-RPC exchange and server name if you hit one of these.
Memory
- No memory retrieval tool — the full MEMORY.md is dumped into context every turn; the LLM can't search or query it selectively
- No memory size cap — if MEMORY.md grows large, it eats into the context window with no automatic pruning
- Consolidation quality depends on the LLM — the summarizer may drop facts the user considers important, or retain noise
- Global memory is explicit-only — the user must invoke /remember <fact>; the LLM does not write to global memory automatically
Telegram rendering
Krill converts markdown to Telegram HTML before sending — bold, italic, strikethrough, code blocks, inline code, links, and headings all translate. Tables are converted to aligned monospace <pre> blocks. Known rough edges:
- Complex tables — tables with very wide columns or mixed-width content may not align well on mobile screens
- Nested formatting in tables — bold/italic inside table cells is stripped (the table is rendered as plain text inside <pre>)
- Long messages — Telegram has a 4096-character limit per message; Krill does not currently split long responses automatically (Discord does)
- HTML fallback — if Telegram rejects the HTML (malformed tags from unusual LLM output), Krill retries as plain text, losing all formatting
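For reference, a newline-aware splitter that would close the long-message gap might look like this (purely illustrative; as noted above, Krill does not do this for Telegram yet):

```julia
# Sketch: greedily pack lines into chunks under Telegram's per-message limit.
# Assumes no single line exceeds the limit; real code must also hard-split.
function split_message(text::AbstractString; limit::Int=4096)
    chunks, current = String[], ""
    for line in split(text, '\n')
        candidate = isempty(current) ? String(line) : current * "\n" * line
        if length(candidate) <= limit
            current = candidate
        else
            isempty(current) || push!(chunks, current)
            current = String(line)
        end
    end
    isempty(current) || push!(chunks, current)
    isempty(chunks) && push!(chunks, "")
    chunks
end
```

Joining the chunks back with newlines reconstructs the original text, so nothing is lost at the boundaries.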
Current boundaries
- Telegram and Discord only — more channels are on the roadmap
- Entry points enable all tools by default — no per-tool UI yet
- context/ is a plain directory, not a versioned or migrated store