Model crosswalk
Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.
Qwen 2.5 72B Instruct
vs
Qwen 3 72B Instruct
Qwen 2.5 72B InstructA
Qwen 2.5 72B Instruct
72B params · 131K context · qwen
Cheapest providerdeepinfra
$/1M input$180000.00
$/1M output$350000.00
Qwen 3 72B InstructB
Qwen 3 72B Instruct
72B params · 131K context · qwen
Cheapest providerfireworks-ai
$/1M input$220000.00
$/1M output$880000.00
Specs and cheapest providers
| Spec | Qwen 2.5 72B Instruct | Qwen 3 72B Instruct |
|---|---|---|
| Parameters | 72B | 72B |
| Context window | 131K tokens | 131K tokens |
| License | qwen | qwen |
| Released | 2024-09-19 | 2025-04-28 |
| Cheapest provider | ||
| Provider | deepinfra | fireworks-ai |
| Input / 1M tokens | $180000.00🏆 | $220000.00 |
| Output / 1M tokens | $350000.00🏆 | $880000.00 |
#6 Qwen 2.5 72B Instruct in cheapest input#5 Qwen 3 72B Instruct in cheapest output#7 Qwen 2.5 72B Instruct in cheapest output#9 Qwen 2.5 72B Instruct in fastest TTFT#10 Qwen 3 72B Instruct in fastest TTFT#9 Qwen 2.5 72B Instruct in highest throughput#10 Qwen 3 72B Instruct in highest throughput#4 Qwen 2.5 72B Instruct in best MMLU#4 Qwen 2.5 72B Instruct in best HumanEval
Add a third model to compare
Benchmark comparison
No benchmark data available for either model yet.
Sample workload — 5M in + 2M out per month
using each model's cheapest providerWhat changes at scale
Output tokens dominate cost above a 1:3 input/output ratio. Below 1:1, input dominates and cheaper-input providers win regardless of headline price.
1M in · 250K out$267500.00 · $440000.00
5M in · 2M out$1600000.00 · $2860000.00
20M in · 10M out$7100000.00 · $13200000.00
100M in · 60M out$39000000.00 · $74800000.00
Capability vs price
scatter// scatter: benchmark × $/1M out
Calculate cost for your workload
Compare total monthly cost across providers for Qwen 2.5 72B Instruct and Qwen 3 72B Instruct using your own input/output token mix.
Open workload calculator →Editor's take
Qwen 3 72B Instruct is the direct successor to Qwen 2.5 72B Instruct; both are dense 72B models from Alibaba, but the third-generation model ships with meaningful improvements. Qwen 3 72B pushes MMLU from ~86 (2.5) to ~89–90, gains a 128K context window (vs 32K for 2.5), and improves on math and coding benchmarks — HumanEval moves from ~86% to ~90%+. The price premium for Qwen 3 72B is real but modest: roughly $0.80–$1.40/M tokens vs $0.60–$1.10/M for Qwen 2.5 72B across major providers.
For teams already running Qwen 2.5 72B, the migration path is straightforward — same tokenizer family, compatible prompt formats — so switching involves minimal integration overhead.
**Where Qwen 2.5 72B wins:** workloads where the lower price is the primary driver and existing MMLU ~86 quality is already meeting SLA. It has broader provider availability and is more likely to be available on legacy GPU SKUs with proven uptime records.
**Where Qwen 3 72B wins:** any workload pushing the boundaries of the 2.5 generation — longer documents that exceed 32K tokens, math and reasoning tasks where the accuracy delta matters, or new deployments where there's no sunk cost in the older model.
Pick [Qwen 2.5 72B Instruct](/models/alibaba--qwen-2.5-72b-instruct) when you're already on it and the cost savings outweigh the quality increment. Pick [Qwen 3 72B Instruct](/models/alibaba--qwen-3-72b-instruct) for new deployments or any workload where the 128K context window or improved reasoning benchmarks pay off.
Related comparisons
Full model details