Token Optimization Guide
Heavy Claude Code usage gets expensive fast. The strategies below can cut token spend by 40–60% without degrading output quality.
Why Token Optimization Matters
Claude charges per token — input and output. A naive agent setup loads every file, every agent definition, and every memory entry into every session. On a busy day that can mean millions of tokens across dozens of tasks.
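To make the arithmetic concrete, here is a minimal cost sketch. The per-million-token rates are illustrative placeholders, not actual Claude pricing; substitute the current published rates:

```python
# Rough cost model: input tokens and output tokens are billed at
# different rates. The rates below are PLACEHOLDERS, not real pricing.
RATE_IN_PER_M = 3.00    # $ per 1M input tokens (assumed for illustration)
RATE_OUT_PER_M = 15.00  # $ per 1M output tokens (assumed for illustration)

def session_cost(tokens_in: int, tokens_out: int) -> float:
    """Estimate the dollar cost of a single session."""
    return tokens_in / 1e6 * RATE_IN_PER_M + tokens_out / 1e6 * RATE_OUT_PER_M

# 50 tasks a day, each loading 80k tokens of context and emitting 2k:
daily = 50 * session_cost(80_000, 2_000)
print(f"${daily:.2f}/day")  # → $13.50/day at these placeholder rates
```

Note that input tokens dominate here: 4M input tokens a day versus only 100k output. That is why the strategies below focus mostly on shrinking what goes *into* the model.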
Optimization Strategies
Context Pruning
Up to 40% savings. Remove stale context before it accumulates: run /sync mid-session to compress memory, and drop files from context that are no longer relevant.
Selective File Reading
Up to 30% savings. Read only the specific functions or sections you need, not entire files. Prefer grep and targeted reads over full file dumps.
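As a sketch of what "targeted reads" saves, the helper below pulls a single top-level function out of a module instead of loading the whole file; the module content and function name are hypothetical:

```python
# Extract only the named top-level def from a module's source,
# rather than sending the entire file into context.
import re

def extract_function(source: str, name: str) -> str:
    """Return just the named top-level function, not the whole module."""
    out, grabbing = [], False
    for line in source.splitlines(keepends=True):
        if re.match(rf"def {name}\b", line):
            grabbing = True
        elif grabbing and line.strip() and not line.startswith((" ", "\t")):
            break  # next top-level statement ends the function body
        if grabbing:
            out.append(line)
    return "".join(out)

module = "def a():\n    return 1\n\ndef b():\n    return 2\n"
snippet = extract_function(module, "b")
# snippet is a fraction of the module: send only what the task needs
```

On a real codebase the ratio is far more dramatic: one 30-line function out of a 2,000-line file.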
Compact Prompts
Up to 20% savings. Eliminate filler phrases: "Please could you kindly help me with..." becomes "Fix:". Task descriptions should be dense, not polite.
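A quick way to sanity-check your own prompts is to compare rough sizes. Word count is only a crude proxy for tokens (real tokenizers split differently), but the ratio is telling:

```python
# Compare a verbose prompt with its compact equivalent.
# Both prompt strings are illustrative examples.
verbose = ("Please could you kindly help me with fixing the bug "
           "in the login handler when you get a chance?")
terse = "Fix: login handler bug"

def approx_tokens(text: str) -> int:
    # Whitespace split is a rough stand-in for a real tokenizer.
    return len(text.split())

savings = 1 - approx_tokens(terse) / approx_tokens(verbose)
```

Multiply that per-prompt saving across every task in a session and it compounds quickly.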
Model Routing
Up to 60% savings. Use cheaper, faster models for simple tasks (formatting, linting, summarising) and reserve the most capable model for reasoning-heavy work.
Model Routing Reference
Not every task needs the most capable (and expensive) model. Routing tasks to the right model is the single highest-leverage optimization.
| Model | Best for | Relative cost |
|---|---|---|
| claude-haiku-3-5 | Formatting, summarising, retrieval, classification | 1× |
| claude-sonnet-4-5 | Coding, debugging, code review, analysis | 5× |
| claude-opus-4 | Architecture decisions, complex reasoning, planning | 15× |
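The table above can be turned into a simple routing function. This is a minimal sketch: the task-type labels are assumptions chosen for illustration, while the model IDs come from the table:

```python
# Route each task type to the cheapest model that handles it well.
# Task-type keys are illustrative; model IDs are from the table above.
ROUTES = {
    "format":    "claude-haiku-3-5",
    "summarise": "claude-haiku-3-5",
    "classify":  "claude-haiku-3-5",
    "code":      "claude-sonnet-4-5",
    "review":    "claude-sonnet-4-5",
    "architect": "claude-opus-4",
    "plan":      "claude-opus-4",
}

def route(task_type: str) -> str:
    # Unknown task types fall back to the mid-tier model: cheap enough
    # for most work, capable enough not to fail silently.
    return ROUTES.get(task_type, "claude-sonnet-4-5")
```

The fallback choice is a design decision: defaulting to the cheapest model risks bad output on misclassified tasks, while defaulting to the most expensive one erases most of the savings.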
Before / After Comparisons
Session start: ~90% reduction
File reading: ~95% reduction
Code review: ~60% reduction
Measuring Token Spend
Use the /cco commands to monitor token usage across sessions:
```
# Show token usage for the current session
/cco budget

# Show token breakdown by task
/cco report

# Export usage data for a date range
/cco export --from 2026-03-01 --to 2026-03-27
```
Run /cco report weekly. Look for tasks with disproportionately high input token counts; those are the first candidates for optimization.
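A sketch of that triage on exported data follows. The CSV column names (`task`, `input_tokens`, `output_tokens`) and the sample rows are assumptions for illustration; check the header of your actual /cco export before relying on them:

```python
# Flag tasks whose input token counts are disproportionately high.
# CSV schema and sample data are ASSUMED, not the documented export format.
import csv
import io

EXPORT = """task,input_tokens,output_tokens
lint-pass,12000,800
full-repo-review,950000,4000
summarise-diff,30000,1200
"""

rows = list(csv.DictReader(io.StringIO(EXPORT)))
flagged = [r["task"] for r in rows if int(r["input_tokens"]) > 100_000]
print(flagged)  # → ['full-repo-review']
```

Here the threshold (100k input tokens) is arbitrary; a better heuristic is to flag anything several times above your median task.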
Get Token-Optimized by Default
The AI Starter Package is built with token efficiency as a core design principle. Context pruning, model routing, and compact command definitions are pre-configured.