Skip to main content
Back to Tools

Anthropic API Pricing Calculator: Claude Haiku, Sonnet 4.6 & Opus 4.8 (2026)

Calculate your real monthly Anthropic API bill with discount toggles for prompt caching (up to 90% off input) and batch API (50% off everything). Compare against GPT-4o at the same token volume.

Updated June 12, 2026 · By Jim Liu

Quick Answer

  • At 1M input + 200K output/month: Haiku 4.5 = $0.80, Sonnet 4.6 = $3.04, Opus 4.8 = $15.04 (no discounts applied).
  • Prompt caching can reduce your input bill by up to 90% — the biggest Anthropic pricing hack most developers miss entirely.
  • Batch API cuts everything in half, but requires async processing with a 24-hour turnaround SLA.
  • Claude Sonnet 4.6 with caching often beats GPT-4o on price for coding workloads with repeated system prompts — effective input cost drops to $0.30/M vs GPT-4o's $2.50/M.

Claude API Pricing Calculator

Best for coding and analysis · Context: 200K tokens

System prompt + user messages + context

Generated responses, code, completions

Enable Prompt CachingUp to 90% off input
Enable Batch API50% off everything
Monthly Cost
$6.00
1.00M input + 0.20M output
Input cost$3.00
Output cost$3.00
vs GPT-4o at same volume
$4.50GPT-4o saves $1.50

GPT-4o at $2.50/M input + $10.00/M output, no caching or batch applied.

Claude Model Lineup 2026: Haiku, Sonnet, Opus — What Each Tier Gets You

Claude Haiku 4.5

Best Value
Fastest and most affordable
Input$0.80/M tokens
Output$4.00/M tokens
Cache write$1.00/M
Cache read$0.08/M

Best for: High-volume simple tasks — summarization, classification, metadata extraction. At $0.80/M input, 10M daily requests costs $8/day.

Claude Sonnet 4.6

Most Popular
Best for coding and analysis
Input$3.00/M tokens
Output$15.00/M tokens
Cache write$3.75/M
Cache read$0.30/M

Best for: Software engineering — code generation, review, and refactoring with test coverage. The default choice for most production Claude deployments.

Claude Opus 4.8

Highest Quality
Most capable, agentic tasks
Input$15.00/M tokens
Output$75.00/M tokens
Cache write$18.75/M
Cache read$1.50/M

Best for: Agentic pipelines — multi-step reasoning, complex analysis, autonomous research. Use sparingly: 5x the Sonnet cost per token.

Prompt Caching: The 90% Input Discount No One Talks About

Mark parts of your prompt as cacheable — system prompt, tool definitions, large document context. The first call writes to cache at a slight premium (~125% of input price); every subsequent call pays the cache read price: ~10% of standard input price.

Real example: AI coding assistant with a 50K-token system prompt. Without caching: $150 per 1M requests. With caching after the first call: ~$16 per 1M requests — 90% reduction on those input tokens.

Enable: cache_control: { type: "ephemeral" } on the content blocks to cache. TTL is 5 minutes, refreshed on each call that includes the cached block.

ModelInput priceCache writeCache readRead vs input
Claude Haiku 4.5$0.80/M$1.00/M$0.08/M10% of input
Claude Sonnet 4.6$3.00/M$3.75/M$0.30/M10% of input
Claude Opus 4.8$15.00/M$18.75/M$1.50/M10% of input

Anthropic Batch API vs Real-Time: When 50% Savings Is Worth the Wait

Anthropic's Message Batches API offers a flat 50% discount on all tokens in exchange for async processing with a 24-hour maximum SLA.

Good for: nightly reports, content pipelines, dataset annotation, bulk summarization. Not for: chatbots, real-time tools, anything user-facing.

API: anthropic.beta.messages.batches.create(requests) — each item needs custom_id + params. Retrieve with batches.results(batch_id).

At 1M input + 200K output per month, no caching
ModelReal-timeBatch API (50% off)Savings
Claude Haiku 4.5$1.60$0.8000$0.8000
Claude Sonnet 4.6$6.00$3.00$3.00
Claude Opus 4.8$30.0$15.0$15.0

Frequently Asked Questions

At an average of 500 input + 500 output tokens per call, 1,000 calls = 0.5M input + 0.5M output. That costs $1.50 + $7.50 = $9.00 at Sonnet 4.6 standard pricing. With batch API this drops to $4.50.
Sponsored

Ad served by Adsterra. OpenAIToolsHub is not responsible for advertiser content.