What it does
LLM coding assistants tend to wrap answers in long, polite prose. Most of the tokens go to articles, connectives, and stock phrases rather than the actual signal. That burns through 5-hour usage limits and API budgets faster than necessary.
Caveman rewrites the model’s output as caveman speech — short fragments, dropped articles, no filler — so the same information lands in roughly 25–35% of the original tokens.
Features
- Intensity levels: `/caveman lite` (filler removed, grammar kept), `/caveman full` (default; fragments, articles dropped), `/caveman ultra` (max compression, abbreviations)
- `/caveman wenyan`: Classical Chinese (文言文) mode for further compression
- `/caveman-commit`: terse conventional commit messages (≤50 chars)
- `/caveman-review`: one-line PR comments with precise line numbers
- `/caveman-stats`: session and lifetime token usage / savings
- `/caveman-compress`: rewrites memory and doc files, ~46% input token reduction
- `cavecrew` subagents: investigator, builder, reviewer
- Statusline savings badge: live session savings shown in the statusline
- `caveman-shrink` MCP middleware: compresses tool descriptions before they enter context
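The ≤50-character limit on commit messages is easy to verify mechanically. The sketch below is not part of Caveman: the function name `is_terse_conventional` is hypothetical, and the list of commit types is an assumption borrowed from the common Angular-style convention (the Conventional Commits format itself only requires `feat` and `fix`).

```python
import re

# Assumed set of commit types; adjust to match your project's convention.
_TYPES = "feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert"

# <type>[(scope)][!]: <description>
_SUBJECT_RE = re.compile(rf"(?:{_TYPES})(?:\([^()\s]+\))?!?: \S.*")


def is_terse_conventional(subject: str, max_len: int = 50) -> bool:
    """Return True if `subject` is a single-line conventional commit
    subject no longer than `max_len` characters."""
    if len(subject) > max_len:
        return False
    # fullmatch rejects multi-line input because "." does not match newlines.
    return _SUBJECT_RE.fullmatch(subject) is not None


if __name__ == "__main__":
    print(is_terse_conventional("fix(auth): handle expired refresh token"))
    print(is_terse_conventional("Fixed the bug"))
```

A check like this could run in a commit-msg hook to confirm that generated messages stay within the limit.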
Example
| Mode | Output | Tokens |
|---|---|---|
| Normal | “The reason your React component is re-rendering is likely because you’re creating a new object reference on each render cycle…” | ~69 |
| Caveman | “New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo.” | 19 |
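As a quick sanity check on the table, the two token counts can be turned into a compression ratio. This is a minimal sketch; the helper name `compression_stats` is made up for illustration, and the 69-token figure is approximate in the source, so the result is approximate too.

```python
def compression_stats(original_tokens: int, compressed_tokens: int) -> tuple[float, float]:
    """Return (fraction of original tokens kept, fraction saved)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    kept = compressed_tokens / original_tokens
    return kept, 1 - kept


if __name__ == "__main__":
    # Token counts taken from the example table (Normal ~69, Caveman 19).
    kept, saved = compression_stats(69, 19)
    print(f"{kept:.1%} of original, {saved:.1%} saved")
    # prints "27.5% of original, 72.5% saved"
```

The 27.5% figure sits inside the 25–35% range quoted for the default mode.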
Supported AI clients
Claude Code, Gemini CLI, Codex, Cursor, Windsurf, Cline, GitHub Copilot, Continue, Kilo, Roo, Augment, Aider, Amp, Goose, JetBrains Junie, Kiro CLI, OpenHands, opencode, Tabnine, Trae, Warp, Replit Agent, Antigravity, and 40+ others.
Before / After
Before: 60–90+ output tokens per answer, so 5-hour limits and API budgets drain quickly.
After: run /caveman and the same answer lands in 19–30 tokens, so sessions stretch further on the same budget.
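To make "stretch further" concrete, the per-answer ranges above can be projected onto a fixed output budget. The 10,000-token budget is a hypothetical number chosen for illustration, 90 is used as the upper end of the "60–90+" range, and the sketch ignores input tokens, which also count against real limits.

```python
def answers_per_budget(budget_tokens: int, tokens_per_answer: int) -> int:
    """Number of whole answers that fit in an output-token budget."""
    return budget_tokens // tokens_per_answer


BUDGET = 10_000  # hypothetical output-token budget

# (fewest, most) answers for each per-answer range from the text.
before = (answers_per_budget(BUDGET, 90), answers_per_budget(BUDGET, 60))
after = (answers_per_budget(BUDGET, 30), answers_per_budget(BUDGET, 19))

if __name__ == "__main__":
    print("before:", before)  # before: (111, 166)
    print("after:", after)    # after: (333, 526)
```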
How to activate
After install, trigger Caveman with any of:
- Slash commands: `/caveman`, `/caveman lite|full|ultra`, `/caveman wenyan`
- Natural language: “talk like caveman”, “less tokens please”
Turn it off with “stop caveman” or “normal mode”.