alibaba · qwen family · Verified May 18, 2026

Qwen 3 72B Instruct

72B params · 131K context · qwen · released 2025-04 · 5 providers

Alibaba's Qwen 3 72B Instruct is the most credible alternative to Llama 3.3 70B in the same parameter class. Released in April 2025, it ships with stronger multilingual support (particularly for Chinese, Japanese, Korean, and Arabic) and slightly better performance on long-context retrieval tasks. Pricing on hosted providers is broadly similar to Llama 3.3 70B: on the providers indexed here, $0.22–$0.40 per 1M input tokens and $0.29–$0.88 per 1M output tokens. It is worth choosing over Llama for non-English-heavy workloads, code generation with non-English comments, or multilingual chat applications. For English-only deployments the choice between the two is largely a coin flip; pick whichever has more favorable pricing on your preferred provider.

Cheapest right now

$0.45/1M out · DeepInfra · fp16

90-day price Δ

— · cheapest-host trend

Fastest TTFT

400 ms · Fireworks AI · 230 tok/s

Providers

5 indexed providers · ✓ verified May 18, 2026

#	Provider	$/1M in	$/1M out	TTFT p50	Tok/s	Quant
1	DeepInfra (cheapest)	$0.23	$0.45	435 ms	195	fp16
2	Fireworks AI (fastest)	$0.22	$0.88	400 ms	230	fp16
3	Together AI	$0.29	$0.29	—	—	unknown
4	OpenRouter	$0.27	$0.85	—	—	unknown
5	Hyperbolic	$0.40	$0.40	520 ms	100	fp16

Publisher	alibaba
Parameters	72B
Context window	131K tokens
License	qwen
Released	2025-04-28
Family	qwen

Pricing by quantization

Quant	Provider	Input / 1M	Output / 1M	Tok/s
fp16	DeepInfra	$0.2300	$0.4500	195
fp16	Fireworks AI	$0.2200	$0.8800	230
fp16	Hyperbolic	$0.4000	$0.4000	100
unknown	OpenRouter	$0.2700	$0.8500	—
unknown	Together AI	$0.2900	$0.2900	—

Questions developers ask

How much does it cost to run Qwen 3 72B Instruct for 100M tokens?

Running Qwen 3 72B Instruct with 100M input and 10M output tokens per month costs approximately $27.50 on DeepInfra (100 × $0.23 + 10 × $0.45), the cheapest available provider for that mix as of the latest pricing data. Costs vary significantly depending on your input/output ratio and whether you use prompt caching.
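The arithmetic behind that figure can be sketched in a few lines of Python. This is a minimal illustration, not part of any provider SDK: the function name `monthly_cost` is hypothetical, and the prices are the DeepInfra figures from the provider table on this page.

```python
def monthly_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Return the cost in USD for a token volume at per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * price_in_per_m
    output_cost = (output_tokens / 1_000_000) * price_out_per_m
    return input_cost + output_cost


# DeepInfra prices from the provider table: $0.23 in, $0.45 out per 1M tokens.
deepinfra_cost = monthly_cost(
    input_tokens=100_000_000,
    output_tokens=10_000_000,
    price_in_per_m=0.23,
    price_out_per_m=0.45,
)
print(f"${deepinfra_cost:.2f}")
```

Swapping in another provider's two prices gives the equivalent figure for that provider. Prompt caching discounts are not modeled here.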

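What is the cheapest provider for Qwen 3 72B Instruct?

DeepInfra currently offers Qwen 3 72B Instruct at the lowest total cost for an input-heavy workload such as 100M input and 10M output tokens per month (about $27.50, versus $30.80 on Fireworks AI, the next cheapest). Prices change frequently; check the provider table above for the latest data.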
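Which provider is cheapest depends on the input/output mix, so it can help to rank all five for your own workload. The sketch below assumes the prices in the provider table on this page and a hypothetical helper, `rank_providers`; the two workloads (100M in / 10M out, and the reverse) are illustrative assumptions rather than a definition used by this site.

```python
# (input $/1M, output $/1M) copied from the provider table on this page.
PRICES = {
    "DeepInfra": (0.23, 0.45),
    "Fireworks AI": (0.22, 0.88),
    "Together AI": (0.29, 0.29),
    "OpenRouter": (0.27, 0.85),
    "Hyperbolic": (0.40, 0.40),
}


def rank_providers(input_millions, output_millions, prices=PRICES):
    """Return (provider, cost_usd) pairs sorted from cheapest to most expensive."""
    costs = {
        name: input_millions * price_in + output_millions * price_out
        for name, (price_in, price_out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


input_heavy = rank_providers(input_millions=100, output_millions=10)
output_heavy = rank_providers(input_millions=10, output_millions=100)

for label, ranking in (("input-heavy", input_heavy), ("output-heavy", output_heavy)):
    cheapest_name, cheapest_cost = ranking[0]
    print(f"{label}: {cheapest_name} at ${cheapest_cost:.2f}")
```

With these prices the input-heavy mix favors DeepInfra, while the output-heavy mix favors Together AI because of its flat $0.29 output price.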

What context window does Qwen 3 72B Instruct support?

Qwen 3 72B Instruct supports a context window of 131,072 tokens. Individual providers may cap this lower.

What's the cheapest way to run Qwen 3 72B Instruct?

The cheapest way to run Qwen 3 72B Instruct for input-heavy workloads is via DeepInfra, at $0.23 per million input tokens and $0.45 per million output tokens. If your workload is prompt-heavy, enabling prompt caching, where the provider offers it, can reduce costs further.

Is there a free tier for Qwen 3 72B Instruct?

Free tiers vary by provider and change frequently. Check each provider's current pricing page for trial credits or free-tier limits. The prices shown on this page reflect paid API access.

How much does Qwen 3 72B Instruct cost for 1M tokens?

Qwen 3 72B Instruct input pricing starts at $0.22 per million tokens on Fireworks AI and $0.23 on DeepInfra. Output pricing ranges from the same as the input price (Together AI, Hyperbolic) to 4× the input price (Fireworks AI), depending on the provider.
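The output-to-input price ratio for each provider can be checked directly against the provider table. The following is a small sketch under that assumption; `output_to_input_ratio` is a hypothetical helper name, and the prices are copied from this page.

```python
# (input $/1M, output $/1M) copied from the provider table on this page.
PRICES = {
    "DeepInfra": (0.23, 0.45),
    "Fireworks AI": (0.22, 0.88),
    "Together AI": (0.29, 0.29),
    "OpenRouter": (0.27, 0.85),
    "Hyperbolic": (0.40, 0.40),
}


def output_to_input_ratio(price_in, price_out):
    """Return how many times more an output token costs than an input token."""
    return price_out / price_in


ratios = {
    name: output_to_input_ratio(price_in, price_out)
    for name, (price_in, price_out) in PRICES.items()
}

# Print providers from the flattest pricing to the steepest output premium.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x")
```

A ratio near 1 suits generation-heavy workloads; a high ratio matters less when most tokens are prompt input.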


Methodology: prices scraped nightly from public pricing pages; snapshots append-only. Prices verified May 18, 2026 by scraping 5 providers.

