Adaptive tiny-model layer · Apache 2.0

90% of your LLM's tool output
is noise it never cites.

PlanckBot sits between a host LLM and its tools. It observes every call, trains a per-tool LoRA on what the model actually quotes in its next turn, then silently compresses the noisy parts at runtime.

97% token reduction
on list_directory
1.86MB adapter size
per tool
~10min CPU training
from 200 triples
[Screenshot: PlanckBot workbench dashboard with live activity feed]

Four layers of adaptation

Each layer is a separate switch you can flip on its own. Start at A, keep adding as you trust the data.

A

Observe

Every MCP tool call gets logged as a triple (input, raw output, what the host LLM quoted in its next turn). No interference, no swapping. Just data.
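A minimal sketch of what one logged observation might look like. The field names and `log_triple` helper are illustrative, not PlanckBot's actual schema:

```python
from dataclasses import dataclass, field
import time

# Hypothetical shape of one observation triple; names are illustrative.
@dataclass
class Triple:
    tool: str            # which MCP tool was called
    input: dict          # arguments the host LLM passed
    raw_output: str      # what the tool returned, unmodified
    quoted: str          # the substring the LLM cited in its next turn
    ts: float = field(default_factory=time.time)

def log_triple(store: list, tool: str, args: dict, output: str, quoted: str) -> None:
    """Append-only logging -- the call itself is never touched."""
    store.append(Triple(tool, args, output, quoted))

store: list[Triple] = []
log_triple(store, "list_directory", {"path": "."},
           "a.py\nb.py\nnode_modules/ (1,842 entries)...", "a.py")
```

The point of layer A is that this is *all* it does: the raw output still flows through untouched, so there is no risk until you opt into layer B.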

B

Compress

A per-tool LoRA (SmolLM2-135M base) is trained on those triples. At inference, the proxy swaps raw→filtered when the adapter's confidence clears threshold. Claude doesn't know.
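The swap logic can be sketched in a few lines. The threshold value and function name here are assumptions, not PlanckBot's actual defaults:

```python
THRESHOLD = 0.9  # illustrative; not PlanckBot's real default

def proxy_response(raw: str, filtered: str, confidence: float,
                   threshold: float = THRESHOLD) -> str:
    """Return the adapter's compressed output only when it is confident.

    Below threshold the host LLM sees the raw output untouched, so a
    poorly trained adapter can never silently corrupt a tool result.
    """
    return filtered if confidence >= threshold else raw

# High confidence: the 2,000-line listing is replaced by the one quoted file.
assert proxy_response("2,000 lines of ls output...", "a.py", 0.97) == "a.py"
# Low confidence: fall back to the raw output.
assert proxy_response("2,000 lines of ls output...", "a.py", 0.42).startswith("2,000")
```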

C

Edit

The host LLM can propose source-level edits to tool code through a meta-tool that's AST-whitelisted. Every edit bumps the tool's version and invalidates the adapter trained against the old one.
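An AST whitelist of this kind can be sketched with Python's standard `ast` module. The allowed node set below is a toy policy for illustration, not PlanckBot's actual gate:

```python
import ast

# Hypothetical whitelist: only these node types may appear in a proposed
# edit. Anything else (Import, While, exec-style calls, ...) is rejected.
ALLOWED = (ast.Module, ast.FunctionDef, ast.arguments, ast.arg, ast.Return,
           ast.Expr, ast.Call, ast.Name, ast.Attribute, ast.Constant,
           ast.Load, ast.BinOp, ast.Add)

def edit_is_allowed(source: str) -> bool:
    """Parse the proposed edit and check every node against the whitelist."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED) for node in ast.walk(tree))

assert edit_is_allowed("def f(x):\n    return x + 1")
assert not edit_is_allowed("import os\nos.system('rm -rf /')")  # Import node rejected
```

Whitelisting node types (rather than blacklisting dangerous ones) means any Python feature the gate's author never considered is rejected by default.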

D

Synthesize

A pattern detector watches for repeating tool sequences and proposes merged synthesized tools. AST-gated generation, hot-reloaded through a second MCP server.
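One simple way to detect repeating tool sequences is to count n-grams over the call history. This is a sketch of the idea, not PlanckBot's detector; the function name and thresholds are illustrative:

```python
from collections import Counter

def repeated_sequences(calls: list[str], n: int = 2, min_count: int = 3) -> list[tuple]:
    """Find tool-name n-grams that recur often enough to be worth
    merging into a single synthesized tool."""
    grams = Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))
    return [g for g, c in grams.items() if c >= min_count]

history = ["read_file", "grep", "read_file", "grep",
           "read_file", "grep", "list_directory"]
# The read_file -> grep pair recurs three times, so it is a merge candidate.
assert repeated_sequences(history) == [("read_file", "grep")]
```

A detected candidate would then flow through the same AST gate as layer C before the merged tool is hot-reloaded.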

60-second quickstart

git clone https://github.com/opcastil11/planckbot && cd planckbot
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"

# First-run setup: creates data dir, migrates DB, registers MCP with Claude Code
planckbot init --upstream-path /path/to/your/project --claude-config

# Launch the workbench
planckbot ui        # http://localhost:8080

Restart Claude Code in that folder. Every filesystem call becomes a triple. After ~200 calls per tool, train the first adapter with planckbot train --tool X --fixture Y.json.

What else is in the box

Run it locally in under a minute

github.com/opcastil11/planckbot