Tracking Costs

Claude Code uses tokens, and tokens cost money. Here’s how to track and optimize your usage.

Check Your Usage

API Users: `/cost`

If you pay per token through the API, run `/cost` to see what the current session has cost so far:

> /cost

Total cost:            $0.55
Total duration (API):  6m 19.7s
Total duration (wall): 6h 33m 10.2s
Total code changes:    0 lines added, 0 lines removed

Subscribers: `/stats`

Claude Max and Pro subscribers have usage included in their plan, so `/cost` isn't relevant for billing. Use `/stats` to see usage patterns instead.

Understanding Tokens

~4 characters = 1 token (English)
~100 tokens = 75 words
Typical code file = 500-2000 tokens
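
As a rough illustration, the rules of thumb above can be turned into a quick estimator. This is only a sketch: the function names are made up for this example, and real tokenizers give different counts for code, non-English text, and unusual formatting.

```python
import math

# Rules of thumb from the list above (English text only).
CHARS_PER_TOKEN = 4
TOKENS_PER_75_WORDS = 100


def estimate_tokens_from_chars(text: str) -> int:
    """Approximate token count using ~4 characters per token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Approximate token count using ~100 tokens per 75 words."""
    word_count = len(text.split())
    return math.ceil(word_count * TOKENS_PER_75_WORDS / 75)


if __name__ == "__main__":
    prompt = "fix the password validation bug in auth.py line 45"
    print("by characters:", estimate_tokens_from_chars(prompt))
    print("by words:     ", estimate_tokens_from_words(prompt))
```

The two estimates will usually disagree slightly; treat either one as an order-of-magnitude check rather than a billing figure.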

Cost factors:

Input tokens (what you send)
Output tokens (what Claude generates)
Context (accumulated history, re-sent as input with each message)
Extended thinking tokens (billed as output)
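
To see how these factors combine, here is a minimal sketch of the arithmetic. The `Pricing` class and the prices passed to it are hypothetical placeholders, not real rates; substitute the current per-million-token prices for the model you use.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical prices in USD per million tokens (placeholders, not real rates)."""
    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(input_tokens: int, output_tokens: int,
                  thinking_tokens: int, pricing: Pricing) -> float:
    """Estimate cost in USD; thinking tokens are billed at the output rate."""
    billed_output = output_tokens + thinking_tokens
    total = (input_tokens * pricing.input_per_mtok
             + billed_output * pricing.output_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    placeholder = Pricing(input_per_mtok=1.0, output_per_mtok=5.0)  # made-up numbers
    cost = estimate_cost(input_tokens=50_000, output_tokens=4_000,
                         thinking_tokens=8_000, pricing=placeholder)
    print(f"estimated cost: ${cost:.4f}")
```

The sketch ignores prompt caching and any other discounts, so it tends to overestimate the input side of a long session.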

Reduce Token Usage

1. Clear Between Tasks

Stale context wastes tokens on every message, because the full conversation history is sent again with each request:

> /rename auth-feature    # Name current session
> /clear                  # Start fresh

Use /resume later to return to named sessions.
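
The effect of stale context can be sketched with a simplified model in which every message re-sends the whole history as input. The function name and the numbers are illustrative assumptions, and the model ignores output tokens and prompt caching.

```python
def cumulative_input_tokens(turn_sizes, carried_context=0):
    """Total input tokens when each message re-sends all prior history.

    turn_sizes: tokens added to the history by each new message.
    carried_context: tokens of old history still present at the start.
    """
    total = 0
    history = carried_context
    for size in turn_sizes:
        history += size   # the new message joins the history
        total += history  # the whole history is sent as input
    return total


if __name__ == "__main__":
    turns = [500] * 10  # ten short messages (made-up sizes)
    stale = cumulative_input_tokens(turns, carried_context=20_000)
    fresh = cumulative_input_tokens(turns)
    print("with 20k stale context:", stale)
    print("after /clear:          ", fresh)
```

With these made-up numbers, the same ten messages consume 227,500 input tokens on top of 20,000 tokens of stale history, versus 27,500 in a fresh session.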

2. Compact Proactively

When context grows large:

> /compact                           # Basic summarization
> /compact Focus on API changes      # Preserve specific topics

3. Choose the Right Model

Model	Relative Cost	Best For
Haiku	$	Quick questions, simple tasks, CI/CD
Sonnet	$$	Daily coding, most tasks
Opus	$$$$	Complex architecture, hard problems

> /model haiku
> what's the syntax for a list comprehension?

4. Be Specific

Vague prompts need more back-and-forth, and every extra turn re-sends the context:

# Expensive
> fix the bug
> no, the other bug
> in auth

# Cheaper
> fix the password validation bug in auth.py line 45

5. Reduce MCP Overhead

Each MCP server adds tool definitions to context, even when idle:

> /context                 # See what's consuming space
> /mcp disable postgres    # Disable unused servers

Prefer CLI tools over MCP when available. Tools like gh, aws, and gcloud are more context-efficient because they don't add persistent tool definitions.

Tool search tuning: When MCP tool definitions exceed 10% of the context window, Claude Code defers them automatically and loads them on demand via tool search. Lower the threshold for more savings:

ENABLE_TOOL_SEARCH=auto:5 claude    # Trigger at 5% instead of 10%
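
The threshold rule can be modelled as a simple percentage check. This is a sketch of the rule as described above, not Claude Code's actual implementation; the function name, the 200,000-token window, and the 15,000-token tool footprint are assumptions chosen for illustration.

```python
def should_defer_mcp_tools(tool_def_tokens: int, context_window: int,
                           threshold_percent: float = 10) -> bool:
    """Return True when tool definitions exceed the given share of the context window."""
    return tool_def_tokens > context_window * threshold_percent / 100


if __name__ == "__main__":
    window = 200_000  # assumed context window size
    tools = 15_000    # assumed size of all MCP tool definitions (7.5% of the window)
    print("default 10% threshold:", should_defer_mcp_tools(tools, window))
    print("lowered 5% threshold: ", should_defer_mcp_tools(tools, window, threshold_percent=5))
```

In this example the tool definitions stay loaded under the default threshold and are deferred only once the threshold is lowered to 5%.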

6. Adjust Extended Thinking

Extended thinking improves results on complex tasks, but its tokens are billed as output. For simpler tasks:

# Disable in /config, or set lower budget:
MAX_THINKING_TOKENS=8000 claude

Token Usage by Activity

Activity	Typical Tokens
Simple question	500-1,000
Explain a file	2,000-5,000
Bug fix	5,000-15,000
Feature implementation	20,000-50,000
Full project scan	50,000-200,000

Team Cost Management

For API teams, set workspace spend limits in the Console.

Rate limit recommendations (tokens per minute, TPM, per user):

Team Size	TPM/User
1-5 users	200k-300k
5-20 users	100k-150k
20-50 users	50k-75k
50-100 users	25k-35k
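
The table can be encoded as a small lookup, for example in a provisioning script. This is a sketch: the function name is hypothetical, and because the rows share boundary values (5, 20, 50), it assumes a boundary team size belongs to the smaller-team row.

```python
# (max team size, low TPM per user, high TPM per user), copied from the table above.
TPM_RECOMMENDATIONS = [
    (5, 200_000, 300_000),
    (20, 100_000, 150_000),
    (50, 50_000, 75_000),
    (100, 25_000, 35_000),
]


def recommended_tpm_per_user(team_size: int) -> tuple[int, int]:
    """Return the (low, high) recommended TPM per user for a team size."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    for max_users, low, high in TPM_RECOMMENDATIONS:
        if team_size <= max_users:  # assumption: boundary sizes use the earlier row
            return low, high
    raise ValueError("the table above only covers teams of up to 100 users")


if __name__ == "__main__":
    low, high = recommended_tpm_per_user(12)
    print(f"12 users: {low:,} to {high:,} TPM per user")
```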

Community Tools

For more detailed tracking, ccusage is an optional community tool:

npx ccusage              # Today's usage
npx ccusage --days 7     # Last week

Quick Reference

Command	Purpose
`/cost`	Session cost (API users)
`/stats`	Usage patterns (subscribers)
`/context`	What’s consuming context
`/compact`	Summarize to save tokens
`/clear`	Start fresh session
`/model`	Switch to cheaper model

For more cost optimization strategies, see Cost & Performance.