Use Case Testing
A quick reference for exercising each capability. Send these messages to your running agent to confirm a feature is working end-to-end.
Slash Commands
No configuration needed — these work out of the box.
/help
Should display available commands, session turn count, and session age.
/new
Should clear the session and report how many turns were cleared.
To test /stop, send a message that triggers a long tool chain (e.g., a web search + follow-up), then quickly send:
/stop
Should respond with "Stopped. The running task has been cancelled."
Web Search
Requires: provider_builtins = true (OpenAI web_search or Gemini googleSearch)
What is the latest version of Julia?
Search the web for recent news about OpenAI.
The response should include cited sources. If it answers from training data only (no citations), provider_builtins may be off or the provider doesn't support it.
Web Fetch
Requires: local_builtins = true
Read this page and summarise it: https://julialang.org
The agent should call web_fetch to retrieve the URL content. If it uses provider-native search instead of fetching the specific URL, the tool selection guidance in AGENTS.md may need tuning.
File Operations
Requires: local_builtins = true, a context/ workspace directory
Create a file called notes.md in the workspace with today's date and a short greeting.
List the files in the workspace.
Read the file notes.md you just created.
File tools are sandboxed to context/ by default. A path-escape attempt should be denied.
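To make "path escape" concrete, here is a minimal sketch of the kind of containment check a sandbox typically performs. It is an illustration only, not Krill's implementation; the function name is_inside_workspace and the sample paths are hypothetical.

```python
from pathlib import Path


def is_inside_workspace(workspace: Path, requested: str) -> bool:
    """Return True if `requested`, resolved against `workspace`, stays inside it."""
    root = workspace.resolve()
    # Joining with an absolute path discards `root`, so "/etc/passwd" is
    # evaluated as-is; ".." segments are collapsed by resolve().
    target = (root / requested).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    ws = Path("context")  # workspace directory named in this guide
    for candidate in ["notes.md", "sub/notes.md", "../secrets.txt", "/etc/passwd"]:
        verdict = "allowed" if is_inside_workspace(ws, candidate) else "denied"
        print(f"{candidate!r}: {verdict}")
```

Prompts that map to the last two candidates (for example, asking the agent to read ../secrets.txt) are the ones that should come back denied.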
GitHub
Requires: local_builtins = true, gh CLI installed and authenticated
Show me the open issues on whanyu1212/Krill.jl
How many stars does whanyu1212/Krill.jl have?
The agent should use the github tool wrapping gh. Verify it uses --json output for structured results.
Memory
Requires: memory = true, memory_consolidation = true
Remember that my preferred language is Julia and I work on data pipelines.
Restart the agent, then:
What do you remember about me?
The agent should recall the preference from ~/.krill/memory/<session>/MEMORY.md without it being in the current conversation history.
Cron Scheduling
Requires: cron = true
Test a one-shot reminder (uses an "at" schedule with a computed UTC datetime):
Remind me in 2 minutes to check the oven.
The agent should create a one-shot job. After 2 minutes, it should send the reminder unprompted.
Test a recurring job (uses an "interval" schedule):
Set a reminder every 1 minute to say "tick".
After adding the job, wait a minute; the agent should send "tick" unprompted.
Then manage jobs:
List my scheduled jobs.
Remove all cron jobs.
Check ~/.krill/cron/jobs.json to confirm persistence across restarts.
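The persistence check can be scripted. The sketch below only confirms that the jobs file exists and parses as JSON, then counts its top-level entries; the file's schema is not described in this guide, so nothing about individual fields is assumed. The helper names load_jobs and describe are hypothetical.

```python
import json
from pathlib import Path

# Path taken from this guide; adjust if your Krill home differs.
JOBS_PATH = Path.home() / ".krill" / "cron" / "jobs.json"


def load_jobs(path: Path):
    """Parse the jobs file and return its content, or None if the file is missing."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def describe(data) -> str:
    """Summarise parsed content without assuming any particular schema."""
    if data is None:
        return "jobs file not found"
    if isinstance(data, (list, dict)):
        return f"valid JSON with {len(data)} top-level entries"
    return "valid JSON (scalar value)"


if __name__ == "__main__":
    print(describe(load_jobs(JOBS_PATH)))
```

Run it once after adding a job and again after restarting the agent; the two summaries should match if jobs persist.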
Subagents
Requires: subagents = true
Spawn a subagent to research the history of the Julia programming language and summarise it in three bullet points.
The parent session should continue normally. When the subagent finishes, its summary is injected back into the conversation.
Skills
Requires: builtin_skills = true or a custom skill in context/skills/
What skills do you have available?
For an always-on skill, its instructions are injected every turn. Verify the agent's behaviour matches the skill document. For an on-demand skill:
Use the <skill-name> skill to help me with X.
MCP Tools
Requires: a configured [[profile.mcp]] block in krill.toml
List the tools available from the filesystem MCP server.
Use the filesystem MCP server to list files in the context directory.
MCP tool names are namespaced as mcp_<name>_<tool>; confirm the tool IDs in the response match the server's declared tools.
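Comparing tool IDs by eye is error-prone when a server declares many tools. A small sketch of the comparison, assuming only the mcp_<name>_<tool> naming rule stated above; the tool names used in the demo (read_file, list_directory) are hypothetical examples, not a statement of what any particular server exposes.

```python
def expected_tool_ids(server_name, declared_tools):
    """Build the namespaced IDs implied by the mcp_<name>_<tool> convention."""
    return {f"mcp_{server_name}_{tool}" for tool in declared_tools}


def diff_tool_ids(server_name, declared_tools, reported_ids):
    """Return (missing, unexpected) IDs, each as a sorted list."""
    expected = expected_tool_ids(server_name, declared_tools)
    reported = set(reported_ids)
    return sorted(expected - reported), sorted(reported - expected)


if __name__ == "__main__":
    # Hypothetical values: replace with the server's declared tools and
    # the IDs the agent actually lists in its response.
    declared = ["read_file", "list_directory"]
    reported = ["mcp_filesystem_read_file", "mcp_filesystem_list_directory"]
    missing, unexpected = diff_tool_ids("filesystem", declared, reported)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

Both lists should be empty when the agent's response matches the server's declaration.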
Hugging Face MCP
Requires: huggingface MCP server configured with a valid $HF_TOKEN
Search for recent papers on vision-language models.
Find Hugging Face models for text-to-speech.
Search Hugging Face datasets about weather time-series.
How do I use LoRA adapters with PEFT?
These prompts validate that the HF MCP tools (mcp_huggingface_*) return results with titles, links, and metadata. The last prompt uses documentation semantic search.
Google Workspace (Gmail)
Requires: google_workspace = true, gws CLI installed and authenticated (gws auth login)
Check my inbox for unread emails.
Send an email to test@example.com with subject "Hello from Krill" and body "This is a test email sent by my AI agent."
Reply to the latest email from LinkedIn saying "Thanks, I'll take a look."
The agent should use the google_workspace tool with gws gmail +triage, +send, and +reply commands respectively. Verify the email appears in your Gmail Sent folder.
WARNING
Other Google Workspace services (Calendar, Drive, Sheets, Docs) are supported by the gws CLI but have not been thoroughly tested with Krill. Gmail send/triage/reply is the primary tested workflow. If you use other services, verify the commands work via gws directly first.
ClawHub Skill Registry
Requires: clawhub = true in [profile.tools], internet access to clawhub.ai
Search for a community skill:
Search ClawHub for a skill related to web scraping.
The agent should call clawhub_search and return a list of matching skills with slug, description, author, downloads, and stars.
Install a skill through the validation pipeline:
Install the <slug> skill from ClawHub.
The agent should call clawhub_install, which downloads the skill to quarantine, runs the validation gate (content scan, metadata check, popularity thresholds), and either promotes it to the verified store or rejects it with reasons. Check ~/.krill/skill_store/ to confirm the directory structure:
~/.krill/skill_store/
├── manifest.json
├── quarantine/ # empty after install completes
└── verified/
    └── <slug>/
        └── SKILL.md
Verify the installed skill is discoverable:
What skills do you have available?
The newly installed skill should appear in the list with source clawhub.
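The skill store layout shown earlier in this section can also be checked with a short script. This is a sketch that tests only what the tree documents (manifest.json present, quarantine/ present and empty, verified/<slug>/SKILL.md present); the helper name check_skill_store and the slug example-skill are hypothetical, and the demo runs against a throwaway directory rather than a real store.

```python
import tempfile
from pathlib import Path


def check_skill_store(store: Path, slug: str) -> list:
    """Return a list of layout problems; an empty list means the layout matches."""
    problems = []
    if not (store / "manifest.json").is_file():
        problems.append("manifest.json is missing")
    quarantine = store / "quarantine"
    if not quarantine.is_dir():
        problems.append("quarantine/ is missing")
    elif any(quarantine.iterdir()):
        problems.append("quarantine/ is not empty")
    if not (store / "verified" / slug / "SKILL.md").is_file():
        problems.append(f"verified/{slug}/SKILL.md is missing")
    return problems


if __name__ == "__main__":
    # Build a temporary store that mirrors the documented tree, then check it.
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        (store / "manifest.json").write_text("{}", encoding="utf-8")
        (store / "quarantine").mkdir()
        skill_dir = store / "verified" / "example-skill"
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text("# example\n", encoding="utf-8")
        print(check_skill_store(store, "example-skill"))  # prints []
```

To check a real install, call check_skill_store(Path.home() / ".krill" / "skill_store", "<slug>") with the slug you installed.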
List and remove installed skills:
List my installed ClawHub skills.
Remove the <slug> skill.
Test validation rejection by configuring a high min_downloads threshold in [clawhub] (e.g., min_downloads = 999999) and attempting to install a low-download skill; it should be rejected with a reason about the download count.
Shell Exec
Requires: exec = true (disabled by default — enable only in trusted environments)
Run the shell command: echo "hello from exec"
What is the current working directory?
Claude Code / Codex Delegation
Requires: claude_code = true or codex = true, CLI authenticated beforehand
Use Claude Code to find all Julia files in the workspace and summarise what each one does.
Delegate to Codex: write a Julia function that computes the nth Fibonacci number and save it to the workspace.
These spawn a subprocess. The agent should report the result, cost/token usage, and session/thread ID when complete.
Telegram Formatting
Requires: Telegram channel enabled
Send a message that triggers a table in the response:
Show me a comparison table of Julia, Python, and Rust, with columns for typing, speed, and ecosystem size.
Tables should render as aligned monospace text in Telegram (inside a <pre> block), not raw pipe characters. Bold, italic, code blocks, and links should also render correctly.
History Summarization
Requires: history_summarization = true
Have a long conversation (20+ exchanges), then:
Summarise what we've discussed so far.
When the context window fills, Krill compresses old turns into a summary and continues. Check the session JSONL at ~/.krill/sessions/<session>/history.jsonl; older turns will be replaced by a summary entry.
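One way to observe the compression is to count entries in the history file before and after a long conversation. The sketch below assumes only that the file is JSON Lines (one JSON value per non-empty line); the entry schema is not documented here, so the sample lines in the demo are arbitrary placeholders and not Krill's format. The helper name count_entries is hypothetical.

```python
import json
import sys


def count_entries(lines) -> int:
    """Count non-empty lines, parsing each to confirm it is valid JSON."""
    count = 0
    for line in lines:
        if line.strip():
            json.loads(line)  # raises ValueError on a malformed entry
            count += 1
    return count


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Usage: python count_history.py ~/.krill/sessions/<session>/history.jsonl
        with open(sys.argv[1], encoding="utf-8") as fh:
            print(count_entries(fh))
    else:
        # Placeholder data so the script runs without a Krill install.
        sample = ['{"placeholder": 1}\n', "\n", '{"placeholder": 2}\n']
        print(count_entries(sample))  # prints 2
```

If summarization has run, the count after the long conversation should be lower than the number of turns exchanged.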