Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

Qwen 2.5 72B Instruct vs Qwen 3 72B Instruct

A: Qwen 2.5 72B Instruct

72B params · 131K context · qwen

Cheapest provider: deepinfra
$/1M input: $180000.00
$/1M output: $350000.00

B: Qwen 3 72B Instruct

72B params · 131K context · qwen

Cheapest provider: fireworks-ai
$/1M input: $220000.00
$/1M output: $880000.00
Specs and cheapest providers

| Spec | Qwen 2.5 72B Instruct | Qwen 3 72B Instruct |
| --- | --- | --- |
| Parameters | 72B | 72B |
| Context window | 131K tokens | 131K tokens |
| License | qwen | qwen |
| Released | 2024-09-19 | 2025-04-28 |
| Cheapest provider | deepinfra | fireworks-ai |
| Input / 1M tokens | $180000.00 🏆 | $220000.00 |
| Output / 1M tokens | $350000.00 🏆 | $880000.00 |

Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

Using each model's cheapest provider:

Qwen 2.5 72B Instruct: $1600000.00 /mo
Qwen 3 72B Instruct: $2860000.00 /mo
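
The monthly figures follow from a simple linear formula: tokens divided by one million, times the per-1M price, summed over input and output. The sketch below reproduces them under that assumption. The prices are copied verbatim from the cheapest-provider rows on this page, and the names `PRICES_PER_1M` and `monthly_cost` are illustrative, not part of any provider API.

```python
# Prices in dollars per 1M tokens, copied as listed on this page
# for each model's cheapest provider.
PRICES_PER_1M = {
    "Qwen 2.5 72B Instruct": {"input": 180000.00, "output": 350000.00},
    "Qwen 3 72B Instruct": {"input": 220000.00, "output": 880000.00},
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the monthly cost in dollars, assuming purely linear per-token pricing."""
    price = PRICES_PER_1M[model]
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * price["input"]
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * price["output"]
    return input_cost + output_cost


# Sample workload from this page: 5M input + 2M output tokens per month.
for model in PRICES_PER_1M:
    print(f"{model}: ${monthly_cost(model, 5_000_000, 2_000_000):.2f} /mo")
```

Real provider bills may also include volume discounts, caching, or minimum charges, none of which this sketch models.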

What changes at scale

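Which side dominates depends on the price ratio: output cost exceeds input cost once the number of output tokens per input token passes (input price ÷ output price). At the prices listed above, that threshold is about 0.51 for Qwen 2.5 72B Instruct and 0.25 for Qwen 3 72B Instruct. Below it, input dominates and cheaper-input providers win regardless of headline price.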

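| Monthly volume | Qwen 2.5 72B Instruct | Qwen 3 72B Instruct |
| --- | --- | --- |
| 1M in · 250K out | $267500.00 | $440000.00 |
| 5M in · 2M out | $1600000.00 | $2860000.00 |
| 20M in · 10M out | $7100000.00 | $13200000.00 |
| 100M in · 60M out | $39000000.00 | $74800000.00 |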
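
One way to check which side of the bill dominates at each volume is to compute the output share of total cost and compare it with 50%. The sketch below does this for the four rows above, using the prices listed on this page. The helper names `break_even_ratio` and `output_share` are hypothetical, and linear pricing is assumed.

```python
def break_even_ratio(price_in: float, price_out: float) -> float:
    """Output tokens per input token at which output spend equals input spend."""
    return price_in / price_out


def output_share(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Fraction of total cost attributable to output tokens (0.0 to 1.0)."""
    input_cost = input_tokens * price_in
    output_cost = output_tokens * price_out
    return output_cost / (input_cost + output_cost)


# (input price, output price) per 1M tokens, as listed on this page.
PRICES = {
    "Qwen 2.5 72B Instruct": (180000.00, 350000.00),
    "Qwen 3 72B Instruct": (220000.00, 880000.00),
}

# Monthly volumes from the table: (input tokens, output tokens).
VOLUMES = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]

for model, (p_in, p_out) in PRICES.items():
    print(f"{model}: break-even at {break_even_ratio(p_in, p_out):.2f} output tokens per input token")
    for tokens_in, tokens_out in VOLUMES:
        share = output_share(tokens_in, tokens_out, p_in, p_out)
        print(f"  {tokens_in:>11,} in / {tokens_out:>10,} out -> output is {share:.0%} of cost")
```

Because the per-1M scaling cancels in the ratio, the share depends only on the token mix and the two prices, not on absolute volume.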

Capability vs price

Scatter plot: benchmark score × $/1M output tokens.
Editor's take
Qwen 3 72B Instruct is the direct successor to Qwen 2.5 72B Instruct; both are dense 72B models from Alibaba, but the third-generation model ships with meaningful improvements. Qwen 3 72B pushes MMLU from ~86 (2.5) to ~89–90, gains a 128K context window (vs 32K for 2.5), and improves on math and coding benchmarks: HumanEval moves from ~86% to ~90%+.

The price premium for Qwen 3 72B is real but modest: roughly $0.80–$1.40/M tokens vs $0.60–$1.10/M for Qwen 2.5 72B across major providers. For teams already running Qwen 2.5 72B, the migration path is straightforward (same tokenizer family, compatible prompt formats), so switching involves minimal integration overhead.

**Where Qwen 2.5 72B wins:** workloads where the lower price is the primary driver and existing MMLU ~86 quality is already meeting SLA. It has broader provider availability and is more likely to be available on legacy GPU SKUs with proven uptime records.

**Where Qwen 3 72B wins:** any workload pushing the boundaries of the 2.5 generation: longer documents that exceed 32K tokens, math and reasoning tasks where the accuracy delta matters, or new deployments where there's no sunk cost in the older model.

Pick [Qwen 2.5 72B Instruct](/models/alibaba--qwen-2.5-72b-instruct) when you're already on it and the cost savings outweigh the quality increment. Pick [Qwen 3 72B Instruct](/models/alibaba--qwen-3-72b-instruct) for new deployments or any workload where the 128K context window or improved reasoning benchmarks pay off.