Save 30-60% on Claude Code costs with an installable skill, web tools, and 11 deep-dive guides.
Claude Code (official plugin system):
/plugin marketplace add Sagargupta16/claude-cost-optimizer
/plugin install cost-mode@sagargupta16-claude-cost-optimizerMulti-agent (Cursor, Cline, Codex, 40+ agents):
npx skills add Sagargupta16/claude-cost-optimizerThen activate in any session:
/cost-mode # Standard (40-60% output token reduction)
/cost-mode lite # Professional brevity (20-40% output reduction)
/cost-mode strict # Telegraphic, max savings (60-70% output reduction)
/cost-mode off # Resume normal behavior
| Feature | How It Saves Tokens |
|---|---|
| Strips filler | Drops pleasantries, hedging, restating your question, trailing summaries |
| Suggests cheaper models | "Haiku handles this -- /model haiku" for simple tasks |
| Suggests CLI tools | "Use prettier/eslint --fix directly" instead of burning LLM tokens |
| Session awareness | Reminds to /compact after 20+ turns, fresh sessions for new tasks |
| Minimal code gen | Diffs over rewrites, no obvious comments, no speculative error handling |
| Auto-deactivates | Full clarity for security warnings, destructive ops, and when you're confused |
Technical accuracy is never sacrificed. Code in commits and PRs is written normally.
No install needed -- use these in your browser:
| Tool | What It Does |
|---|---|
| Repo Analyzer | Paste a GitHub URL to get a cost audit, grade (A+ to F), and recommendations |
| Cost Calculator | Estimate monthly spend based on your model, sessions, and config |
| Badge Checker | Score your setup and get a shields.io badge for your repo |
Real project, 30-turn session, Opus 4.6:
BEFORE (no optimization): AFTER (5 minutes of setup):
CLAUDE.md: 6,200 chars (truncated) CLAUDE.md: 2,800 chars (under limit)
.claudeignore: missing .claudeignore: 12 patterns
MCP servers: 6 active MCP servers: 2 active
System prompt: ~15,000 tokens/turn System prompt: ~5,500 tokens/turn
Session cost: $2.85 Session cost: $1.12
Monthly (3x/day): $188.10 Monthly (3x/day): $73.92
Savings: $114.18/month (61%)
Even without installing the skill, these 5 changes cut costs immediately:
| # | Strategy | Savings | Guide |
|---|---|---|---|
| 1 | Keep CLAUDE.md under 4,000 characters -- content beyond 4K is silently truncated | 10-20% | Context Optimization |
| 2 | Use Haiku for simple tasks (--model haiku) -- 5x cheaper than Opus |
20-40% | Model Selection |
| 3 | Use Plan Mode before coding -- prevents wasted iterative cycles | 15-25% | Workflow Patterns |
| 4 | Add .claudeignore -- stop Claude from reading node_modules, dist, lock files |
5-15% | Context Optimization |
| 5 | Delegate to subagents -- isolate expensive searches from main context | 20-40% | Workflow Patterns |
Full walkthrough: Getting Started in 5 Minutes
cost-mode is the first skill. More are planned:
| Skill | Status | What It Does | Cost Impact |
|---|---|---|---|
| cost-mode | Live | Concise responses, model routing suggestions, session awareness | 30-60% cost savings |
| claudeignore-gen | Planned | Auto-generates .claudeignore based on your project's tech stack | 5-15% input reduction |
| context-compress | Planned | Rewrites your CLAUDE.md to be shorter while keeping all essential info | 10-20% input reduction |
| cache-optimizer | Planned | Detects cache-busting patterns and suggests fixes to maximize prompt cache hits | 10-25% input reduction |
| budget-guard | Planned | Per-session and per-day spending limits with warnings before you blow past them | Prevents overspend |
Want to build one? Skills are just SKILL.md files -- see CONTRIBUTING.md and the skills/cost-mode/ directory for the pattern.
11 deep-dive guides covering every optimization area:
| Guide | What You'll Learn |
|---|---|
| 00 - Getting Started | Zero to optimized in 5 minutes -- the essential setup |
| 01 - Understanding Costs | How billing works, what costs the most, where money goes |
| 02 - Context Optimization | Reduce input tokens: CLAUDE.md, .claudeignore, file reads |
| 03 - Model Selection | When to use Opus vs Sonnet vs Haiku (with decision tree) |
| 04 - Workflow Patterns | Plan mode, subagents, commands, batch operations |
| 05 - Team Budgeting | Per-developer budgets, cost tracking, ROI calculation |
| 06 - Access Methods & Pricing | Compare API vs Bedrock vs Vertex AI vs Claude Code pricing |
| 07 - MCP & Agent Cost Impact | MCP server overhead, subagent costs, Agent SDK patterns |
| 08 - Prompt Caching Deep Dive | Cache mechanics, TTL economics, maximizing hit rates, ROI math |
| 09 - Subscription Plan Value | Choose the right plan, maximize allowance, upgrade/downgrade signals |
| 10 - Three-Tier Task Routing | Skip the LLM for Tier 0 tasks, route cheap tasks to Haiku, save Opus for complex work |
Also: Visual Diagrams (Mermaid flowcharts) | One-Page Cheatsheet
Copy-paste configs that are already optimized:
CLAUDE.md: Minimal | Standard | Comprehensive | Monorepo
By Stack: React+Vite | Next.js | FastAPI | MERN | Terraform | Go | Rust | Django | Rails | Spring Boot
Settings: Cost-Conscious | Balanced | Performance-First
Commands: /cost-check | /budget-mode | /quick-fix | /optimize
7 tools for measuring, tracking, and reducing costs. Full tools documentation
| Tool | What It Does |
|---|---|
| Token Estimator | Estimate token count and cost for any file |
| Usage Analyzer | Find cost hotspots across your sessions |
| Badge Generator | Grade your project config (A+ to F) from the CLI |
| MCP Cost Server | In-session cost estimation via MCP |
| VS Code Extension | Token count and cost in the status bar |
| GitHub Action | Automated cost audit on PRs |
| Budget Hooks | Track tool calls, log costs, warn at thresholds |
| Model | Input / 1M | Output / 1M | Cache Hit / 1M | Context | Max Output |
|---|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | 1M | 32K |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | 64K |
| Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | 64K |
Fast Mode: 6x. Long context (>200K input): 2x input, 1.5x output. Batch API: 50% off. Plans: Pro $20/mo, Max 5x $100/mo, Max 20x $200/mo.
- Task Comparison -- same task, optimized vs not
- Model Comparison -- Opus vs Sonnet vs Haiku
- Context Size Impact -- how CLAUDE.md size affects cost
- Community Leaderboard -- crowdsourced cost-per-task data
- Case Studies -- real-world optimization stories
- GitHub Discussions -- ask questions, share strategies
- Contributing -- from starring the repo to building new skills
- Issue Templates -- tips, benchmarks, case studies
| Project | What It Does | Savings |
|---|---|---|
| caveman | Brevity skill -- strips filler from responses | 50-75% output |
| claude-mem | Compresses session context for handoff | Input reduction |
| claudetop | htop-style monitoring with cost tracking | Visibility |
How much does Claude Code actually cost?
With Pro ($20/mo), Max 5x ($100/mo), or Max 20x ($200/mo), you get included usage. Heavy users report $3-15/day without optimization, $1-5/day with it. Opus 4.6 at $5/$25 is much more affordable than previous generations.
Does this apply to the Claude API too?
Context optimization, model selection, and prompt engineering apply to both. The skill and commands are Claude Code-specific.
Will optimization reduce output quality?
No. These strategies eliminate waste (duplicate context, unnecessary file reads, expensive models for simple tasks). Quality stays the same or improves -- less noise means better reasoning.
What's the biggest single change I can make?
Install cost-mode (npx skills add Sagargupta16/claude-cost-optimizer) and switch to Haiku for routine tasks. Combined: 50-70% savings.
If this repo helped you save money, consider giving it a star!
If you found this useful, check out my other AI/Claude tools:
| Project | Description |
|---|---|
| claude-code-recipes | 50+ copy-paste recipes for Claude Code - commands, subagents, hooks, skills |
| claude-skills | Custom Claude Code plugin marketplace with dev-workflow, FARM stack, and more |
| agent-recipes | AI agent workflows for real-world dev tasks - code review, testing, security |
| ai-git-hooks | AI-powered git hooks - auto-review diffs, generate commit messages, security scanning |
| mcp-toolkit | Production-ready middleware for MCP servers - auth, caching, rate limiting |
MIT - use these strategies, templates, and tools however you want.