AI API Pricing Comparison: Cost Per Million Tokens

Updated 2026-03-10

Data Notice: Figures, rates, and statistics cited in this article are based on the most recent available data at time of writing and may reflect projections or prior-year figures. Always verify current numbers with official sources before making financial, medical, or educational decisions.

AI API pricing changes frequently as providers compete on cost and capability. This page compares current pricing across all major providers in a single, easy-to-reference table. Bookmark it and check back regularly for updates.

AI model comparisons are based on publicly available benchmarks and editorial testing. Results may vary by use case.

Complete Pricing Table (March 2026)

Premium Tier Models

| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| Claude Opus 4 | Anthropic | $15.00 | $75.00 | 200K | Strongest reasoning |
| o3 | OpenAI | $10.00 | $40.00 | 200K | + thinking token costs |
| Gemini Ultra | Google | $7.00 | $21.00 | 1M+ | Largest context window |

Mid-Tier Models (Best Value)

| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K | Best quality/cost ratio |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | Strong generalist |
| Mistral Large | Mistral | $2.00 | $6.00 | 128K | Good multilingual |
| Gemini Pro | Google | $1.25 | $5.00 | 1M+ | Great value with large context |

Budget Tier Models

| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| o3-mini | OpenAI | $1.10 | $4.40 | 200K | Budget reasoning |
| Claude Haiku 4 | Anthropic | $0.25 | $1.25 | 200K | Fast, very cheap |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K | Budget general purpose |
| Gemini Flash | Google | $0.075 | $0.30 | 1M+ | Cheapest capable model |

All prices as of March 2026. Check provider websites for the latest pricing.

Cost Per Common Task

| Task | Approximate Tokens | Opus 4 | Sonnet 4 | GPT-4o | Haiku 4 | Flash |
|---|---|---|---|---|---|---|
| Single question + answer | 500 in / 300 out | $0.03 | $0.006 | $0.004 | $0.0005 | $0.0001 |
| Blog post generation | 500 in / 1,500 out | $0.12 | $0.024 | $0.016 | $0.002 | $0.0005 |
| Document summary (20 pages) | 15K in / 500 out | $0.26 | $0.053 | $0.043 | $0.004 | $0.001 |
| Code review (large file) | 5K in / 2K out | $0.23 | $0.045 | $0.033 | $0.004 | $0.001 |
| Full book analysis | 150K in / 2K out | $2.40 | $0.48 | N/A* | $0.04 | $0.01 |

*GPT-4o’s 128K context window means it cannot process a full book in one pass.

Cost Reduction Features

Prompt Caching (Anthropic)

Anthropic offers prompt caching that reduces the cost of repeated context (system prompts, reference documents) by up to 90%. This is significant for applications that reuse the same context across many queries.

  • Cache write: 1.25x base input cost
  • Cache read: 0.1x base input cost (90% discount)
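
A rough sketch of the break-even arithmetic, assuming the 1.25x write / 0.1x read multipliers apply to the entire cached prefix (the token counts and query volume below are made-up example values):

```python
def cost_without_cache(context_tokens, question_tokens, rate_per_m, n_queries):
    """Every query re-sends the full context at the base input rate."""
    return (context_tokens + question_tokens) * rate_per_m * n_queries / 1_000_000

def cost_with_cache(context_tokens, question_tokens, rate_per_m, n_queries):
    """Write the context to the cache once (1.25x base rate), then read it
    at 0.1x on the remaining queries; questions are billed at the base rate."""
    write = context_tokens * rate_per_m * 1.25
    reads = context_tokens * rate_per_m * 0.10 * (n_queries - 1)
    questions = question_tokens * rate_per_m * n_queries
    return (write + reads + questions) / 1_000_000

# 50K-token reference document, 200-token questions, Claude Sonnet 4 input
# rate ($3.00/1M), 100 queries:
plain = cost_without_cache(50_000, 200, 3.00, 100)   # $15.06
cached = cost_with_cache(50_000, 200, 3.00, 100)     # $1.7325
```

In this example caching cuts the input bill by roughly 88%, close to the advertised ceiling because the reused context dwarfs the per-query question.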

Batch Processing

Several providers offer discounts for non-time-sensitive batch processing:

| Provider | Batch Discount | Turnaround |
|---|---|---|
| Anthropic | 50% off | Within 24 hours |
| OpenAI | 50% off | Within 24 hours |
| Google | Varies | Varies |
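
The batch discount is a flat multiplier on the standard bill, so the savings scale linearly with volume. A minimal sketch, using the ~$0.004 document-summary cost for Claude Haiku 4 from the task table above as the example unit price:

```python
def batched_cost(standard_cost: float, discount: float = 0.50) -> float:
    """Apply a flat batch-processing discount (50% for Anthropic and OpenAI)."""
    return standard_cost * (1.0 - discount)

# Summarizing 10,000 documents on Claude Haiku 4 at ~$0.004 each:
standard = 10_000 * 0.004        # $40.00 at on-demand rates
batch = batched_cost(standard)   # $20.00, delivered within the batch window
```

The trade is purely latency for price: if results can wait up to 24 hours, half the bill disappears.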

Volume Discounts

Enterprise agreements with committed usage can reduce pricing further. Contact providers directly for volume pricing.

AI API pricing has fallen dramatically:

| Year | Cost for GPT-4-class 1M Output Tokens | Reduction |
|---|---|---|
| 2023 | ~$60.00 | Baseline |
| 2024 | ~$15.00 | 75% reduction |
| 2025 | ~$10.00 | 83% reduction |
| 2026 | ~$5.00-$10.00 | 83-92% reduction |

Expect continued price reductions as inference efficiency improves and competition intensifies.

Choosing the Right Tier

Use premium models when:

  • Complex reasoning, analysis, or coding is required
  • Accuracy on difficult problems justifies the cost
  • The cost of errors exceeds the cost of using a better model

Use mid-tier models when:

  • You need good quality for general tasks
  • You want the best quality-to-cost ratio
  • Volume is moderate (hundreds to thousands of queries/day)

Use budget models when:

  • Tasks are simple (classification, extraction, routing)
  • Volume is very high (millions of queries/day)
  • Speed matters more than maximum quality
  • You are building a first pass before human review
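
The guidance above amounts to a simple decision rule. The sketch below encodes it with illustrative thresholds and labels that are assumptions of this article, not provider recommendations:

```python
def pick_tier(complexity: str, daily_volume: int) -> str:
    """Route a workload to a pricing tier per the criteria above.
    `complexity` is "complex" or "simple"; thresholds are illustrative."""
    if complexity == "complex":          # reasoning, analysis, hard coding
        return "premium"                 # e.g. Claude Opus 4, o3
    if daily_volume >= 1_000_000:        # simple tasks at very high volume
        return "budget"                  # e.g. Claude Haiku 4, Gemini Flash
    return "mid-tier"                    # e.g. Claude Sonnet 4, GPT-4o
```

In practice many teams route per-request rather than per-application, sending easy queries to a budget model and escalating only the hard ones.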


Key Takeaways

  • AI API pricing spans a 200x range from Gemini Flash ($0.075/1M input) to Claude Opus 4 ($15/1M input).
  • Mid-tier models (Claude Sonnet 4, GPT-4o) offer the best quality-to-cost ratio for most applications.
  • Prompt caching and batch processing can reduce costs by 50-90% for eligible workloads.
  • Prices have dropped 83-92% since 2023 and continue falling.
  • Output tokens are consistently more expensive than input tokens (2-5x across the models above), so controlling output length saves money.

This content is for informational purposes only and reflects independently researched comparisons. AI model capabilities change frequently — verify current specs with providers. Not professional advice.