Tracking Costs

Claude Code uses tokens, and tokens cost money. Here’s how to track and optimize your usage.

If you’re paying via API, use /cost to see what your current session has cost so far:

> /cost
Total cost: $0.55
Total duration (API): 6m 19.7s
Total duration (wall): 6h 33m 10.2s
Total code changes: 0 lines added, 0 lines removed

Claude Max and Pro subscribers have usage included—/cost isn’t relevant for billing. Use /stats to see usage patterns instead.

Rough token equivalents:

  • ~4 characters = 1 token (English)
  • ~100 tokens = 75 words
  • Typical code file = 500-2000 tokens
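
The ratios above can be turned into a quick estimator. This is a minimal sketch that assumes English text and uses only the approximations listed here; the function names are hypothetical, and a real tokenizer will give different counts.

```python
def estimate_tokens_from_chars(text: str) -> int:
    """Rough estimate using ~4 characters per token (English)."""
    if not text:
        return 0
    return max(1, round(len(text) / 4))


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough estimate using ~100 tokens per 75 words."""
    return round(word_count * 100 / 75)


if __name__ == "__main__":
    sample = "x" * 4000  # stand-in for a 4,000-character source file
    print(estimate_tokens_from_chars(sample))
    print(estimate_tokens_from_words(750))
```

Treat the result as a ballpark figure for planning, not as a billing number.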

Cost factors:

  • Input tokens (what you send)
  • Output tokens (what Claude generates)
  • Context (accumulated history)
  • Extended thinking tokens (billed as output)
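
To see how these factors combine, here is a minimal cost sketch. The per-million-token prices are placeholders chosen for illustration, not actual rates, and the `Usage` class and `session_cost` function are hypothetical names. The one rule taken from the list above is that thinking tokens are billed as output.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int
    thinking_tokens: int = 0  # billed at the output rate


def session_cost(usage: Usage,
                 input_price_per_mtok: float,
                 output_price_per_mtok: float) -> float:
    """Return the cost in dollars for one session's token usage."""
    billed_output = usage.output_tokens + usage.thinking_tokens
    total = (usage.input_tokens * input_price_per_mtok
             + billed_output * output_price_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder prices: $3 per million input tokens, $15 per million output tokens.
    usage = Usage(input_tokens=100_000, output_tokens=10_000, thinking_tokens=5_000)
    print(f"${session_cost(usage, 3.0, 15.0):.3f}")
```

Check current pricing for your model before relying on any figure this produces.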

Stale context wastes tokens on every message:

> /rename auth-feature # Name current session
> /clear # Start fresh

Use /resume later to return to named sessions.
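
A simplified model shows why clearing helps. The sketch below assumes every message resends the full history as input and that each turn adds a fixed number of tokens; it ignores prompt caching, compaction, and output tokens, so the numbers are illustrative only. The function name and parameters are hypothetical.

```python
from typing import Optional


def total_input_tokens(turns: int,
                       tokens_per_turn: int,
                       clear_every: Optional[int] = None) -> int:
    """Sum the input tokens sent over a session under a naive resend-everything model."""
    total = 0
    history = 0
    for turn in range(1, turns + 1):
        history += tokens_per_turn  # the new message joins the history
        total += history            # the whole history is sent as input
        if clear_every and turn % clear_every == 0:
            history = 0             # simulate /clear after this turn
    return total


if __name__ == "__main__":
    print(total_input_tokens(20, 1_000))                 # never cleared
    print(total_input_tokens(20, 1_000, clear_every=5))  # cleared every 5 turns
```

Under these assumptions, input grows roughly with the square of the number of turns, which is why a long unrelated history is costly.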

When context grows large:

> /compact # Basic summarization
> /compact Focus on API changes # Preserve specific topics

Model    Cost    Best For
Haiku    $       Quick questions, simple tasks, CI/CD
Sonnet   $$      Daily coding, most tasks
Opus     $$$$    Complex architecture, hard problems

> /model haiku
> what's the syntax for a list comprehension?

Vague prompts need more back-and-forth:

# Expensive
> fix the bug
> no, the other bug
> in auth
# Cheaper
> fix the password validation bug in auth.py line 45

Each MCP server adds tool definitions to context, even when idle:

> /context # See what's consuming space
> /mcp disable postgres # Disable unused servers

Prefer CLI tools over MCP when available. Tools like gh, aws, gcloud are more context-efficient—they don’t add persistent tool definitions.

Tool search tuning: When MCP tools exceed 10% of context, Claude auto-defers them via tool search. Lower the threshold for more savings:

ENABLE_TOOL_SEARCH=auto:5 claude # Trigger at 5% instead of 10%
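
The threshold is a simple percentage of the context window. The sketch below only illustrates that arithmetic; it is not Claude Code's actual logic, the function name is hypothetical, and the 200,000-token window in the example is an assumed figure.

```python
def exceeds_tool_threshold(tool_definition_tokens: int,
                           context_window_tokens: int,
                           threshold_percent: float = 10.0) -> bool:
    """Return True when tool definitions take more than threshold_percent of the context."""
    if context_window_tokens <= 0:
        raise ValueError("context_window_tokens must be positive")
    share = tool_definition_tokens / context_window_tokens * 100
    return share > threshold_percent


if __name__ == "__main__":
    # 15,000 tokens of tool definitions in an assumed 200,000-token window = 7.5%
    print(exceeds_tool_threshold(15_000, 200_000))                       # default 10%
    print(exceeds_tool_threshold(15_000, 200_000, threshold_percent=5))  # lowered to 5%
```

In this example the same set of tools stays loaded at the default threshold but would be deferred at 5%.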

Extended thinking improves results on complex tasks, but thinking tokens are billed as output. For simpler tasks:

# Disable thinking in /config, or set a lower budget:
MAX_THINKING_TOKENS=8000 claude

Activity                  Typical Tokens
Simple question           500-1,000
Explain a file            2,000-5,000
Bug fix                   5,000-15,000
Feature implementation    20,000-50,000
Full project scan         50,000-200,000
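
These ranges can be combined into a rough budget for a day or a sprint. A minimal sketch, using only the typical figures listed here; the dictionary keys and function name are hypothetical, and real usage varies with file size and context.

```python
from typing import Dict, Tuple

# (low, high) typical token counts per activity, copied from the figures above.
TYPICAL_TOKENS: Dict[str, Tuple[int, int]] = {
    "simple question": (500, 1_000),
    "explain a file": (2_000, 5_000),
    "bug fix": (5_000, 15_000),
    "feature implementation": (20_000, 50_000),
    "full project scan": (50_000, 200_000),
}


def estimate_token_range(activities: Dict[str, int]) -> Tuple[int, int]:
    """Return (low, high) total tokens for a mapping of activity name to count."""
    low = sum(TYPICAL_TOKENS[name][0] * count for name, count in activities.items())
    high = sum(TYPICAL_TOKENS[name][1] * count for name, count in activities.items())
    return low, high


if __name__ == "__main__":
    plan = {"simple question": 10, "bug fix": 3, "feature implementation": 1}
    print(estimate_token_range(plan))
```

An unknown activity name raises a KeyError, which keeps typos from silently producing a zero estimate.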

For API teams, set workspace spend limits in the Console.

Rate limit recommendations (TPM per user):

Team Size       TPM/User
1-5 users       200k-300k
5-20 users      100k-150k
20-50 users     50k-75k
50-100 users    25k-35k
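
The recommendations can be expressed as a lookup. This sketch assumes that a team size on a shared boundary (5, 20, or 50 users) falls into the smaller bracket, which the ranges above do not specify; the function name is hypothetical.

```python
from typing import Tuple

# (largest team size in bracket, low TPM per user, high TPM per user)
RECOMMENDATIONS = [
    (5, 200_000, 300_000),
    (20, 100_000, 150_000),
    (50, 50_000, 75_000),
    (100, 25_000, 35_000),
]


def recommended_tpm_per_user(team_size: int) -> Tuple[int, int]:
    """Return the (low, high) recommended TPM per user for a team size."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    for max_size, low, high in RECOMMENDATIONS:
        if team_size <= max_size:
            return low, high
    raise ValueError("no recommendation listed for teams above 100 users")


if __name__ == "__main__":
    print(recommended_tpm_per_user(12))
```

Teams larger than 100 users raise an error because no recommendation is listed for them.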

For more detailed tracking, ccusage is an optional community tool:

npx ccusage # Today's usage
npx ccusage --days 7 # Last week

Command     Purpose
/cost       Session cost (API users)
/stats      Usage patterns (subscribers)
/context    What’s consuming context
/compact    Summarize to save tokens
/clear      Start fresh session
/model      Switch to cheaper model

For more cost optimization strategies, see Cost & Performance.