Model crosswalk
Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.
Mistral Large 2 vs Qwen 2.5 72B Instruct
Mistral Large 2 (model A)
123B params · 131K context · mistral-research
Cheapest provider: openrouter
$/1M input: $1800000.00
$/1M output: $5400000.00
Qwen 2.5 72B Instruct (model B)
72B params · 131K context · qwen
Cheapest provider: deepinfra
$/1M input: $180000.00
$/1M output: $350000.00
Specs and cheapest providers
| Spec | Mistral Large 2 | Qwen 2.5 72B Instruct |
|---|---|---|
| Parameters | 123B | 72B |
| Context window | 131K tokens | 131K tokens |
| License | mistral-research | qwen |
| Released | 2024-07-24 | 2024-09-19 |
| Cheapest provider | openrouter | deepinfra |
| Input / 1M tokens | $1800000.00 | $180000.00 🏆 |
| Output / 1M tokens | $5400000.00 | $350000.00 🏆 |
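To put the listed gap in relative terms, here is a minimal Python sketch. It assumes the four cheapest-provider prices exactly as they appear in the table above; the `PRICES` layout and the helper names `percent_cheaper` and `output_to_input_ratio` are illustrative, not part of any provider's API.

```python
# Cheapest-provider prices per 1M tokens, copied as listed in the table above.
# The dictionary layout is an assumption made for this sketch.
PRICES = {
    "Mistral Large 2": {"input": 1_800_000.00, "output": 5_400_000.00},
    "Qwen 2.5 72B Instruct": {"input": 180_000.00, "output": 350_000.00},
}


def percent_cheaper(cheap: float, expensive: float) -> float:
    """Return how much lower `cheap` is than `expensive`, in percent."""
    return (1 - cheap / expensive) * 100


def output_to_input_ratio(model: str) -> float:
    """Return a model's output price as a multiple of its input price."""
    price = PRICES[model]
    return price["output"] / price["input"]


if __name__ == "__main__":
    mistral = PRICES["Mistral Large 2"]
    qwen = PRICES["Qwen 2.5 72B Instruct"]
    for kind in ("input", "output"):
        saving = percent_cheaper(qwen[kind], mistral[kind])
        print(f"{kind}: Qwen 2.5 72B Instruct is {saving:.1f}% cheaper")
    for model in PRICES:
        print(f"{model}: output costs {output_to_input_ratio(model):.2f}x input")
```

The output-to-input multiple matters as much as the headline price, because it determines how quickly a generation-heavy workload shifts the bill toward output tokens.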
Benchmark comparison
No benchmark data available for either model yet.
Sample workload: 5M in + 2M out per month
using each model's cheapest provider
What changes at scale
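Which side of the bill dominates depends on each model's output-to-input price ratio. Mistral Large 2 prices output at 3× input, so output tokens account for most of the cost once output volume exceeds one third of input volume (an input:output ratio below 3:1). Qwen 2.5 72B Instruct prices output at about 1.9× input, so its crossover sits near one output token for every two input tokens. Below those thresholds input dominates, and providers with cheaper input win regardless of headline output price.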
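| Monthly workload | Mistral Large 2 | Qwen 2.5 72B Instruct |
|---|---|---|
| 1M in · 250K out | $3150000.00 | $267500.00 |
| 5M in · 2M out | $19800000.00 | $1600000.00 |
| 20M in · 10M out | $90000000.00 | $7100000.00 |
| 100M in · 60M out | $504000000.00 | $39000000.00 |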
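The workload figures follow from one linear formula: monthly cost = (input tokens / 1M) × input price + (output tokens / 1M) × output price. The sketch below reproduces them in Python, assuming the cheapest-provider prices exactly as listed on this page; `monthly_cost` and `output_share` are hypothetical helper names introduced for illustration.

```python
# Cheapest-provider prices per 1M tokens, as listed on this page (assumed inputs).
PRICES = {
    "Mistral Large 2": {"input": 1_800_000.00, "output": 5_400_000.00},
    "Qwen 2.5 72B Instruct": {"input": 180_000.00, "output": 350_000.00},
}

# (input tokens, output tokens) per month for the four workloads shown.
WORKLOADS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]


def monthly_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Total monthly cost for a token mix at the model's listed prices."""
    price = PRICES[model]
    return (tokens_in / 1e6) * price["input"] + (tokens_out / 1e6) * price["output"]


def output_share(model: str, tokens_in: int, tokens_out: int) -> float:
    """Fraction of the monthly cost that comes from output tokens."""
    output_cost = (tokens_out / 1e6) * PRICES[model]["output"]
    return output_cost / monthly_cost(model, tokens_in, tokens_out)


if __name__ == "__main__":
    for tokens_in, tokens_out in WORKLOADS:
        for model in PRICES:
            cost = monthly_cost(model, tokens_in, tokens_out)
            share = output_share(model, tokens_in, tokens_out)
            print(f"{tokens_in:>11,} in / {tokens_out:>10,} out  "
                  f"{model:<22} ${cost:,.2f}  (output share {share:.0%})")
```

Swapping in your own token counts, or a different provider's prices, only requires editing `WORKLOADS` or `PRICES`. The output share shows which price component is driving the total for a given mix.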
Capability vs price
[Scatter plot: benchmark score × $/1M output tokens]
Calculate cost for your workload
Compare total monthly cost across providers for Mistral Large 2 and Qwen 2.5 72B Instruct using your own input/output token mix.
Editor's take
The core tradeoff here is parameter count versus price. [Mistral Large 2](/models/mistralai--mistral-large-2) runs 123B parameters; [Qwen 2.5 72B Instruct](/models/alibaba--qwen-2.5-72b-instruct) comes in at 72B. Across commodity providers, Qwen 2.5 72B consistently prices 40–55% cheaper per million tokens on both input and output, largely because it fits on fewer GPUs and competes in a more crowded supply market. At the cheapest-provider prices listed above the gap is wider still: about 90% lower on input and about 94% lower on output.
Benchmark scores narrow the capability gap that the parameter counts imply. On MMLU and coding evals, Qwen 2.5 72B punches well above its size, and Alibaba's post-training work on math and code reasoning shows in practice. For most general-purpose tasks, quality differences are marginal unless you're doing highly nuanced multi-step reasoning or complex function-call chaining.
**Where Mistral Large 2 wins:** Tasks demanding deep reasoning over long contexts (64K+ tokens), multi-document synthesis, and agentic pipelines with dense tool-use schemas. Its larger capacity handles context degradation more gracefully at high fill ratios.
**Where Qwen 2.5 72B Instruct wins:** Cost-constrained API products, coding assistants, math-heavy workflows, and any workload where you can validate quality cheaply and scale horizontally. The model's strong multilingual coverage (29+ languages) also makes it preferable for non-English inference at scale.
Pick Mistral Large 2 if you need headroom for complex reasoning over very long inputs and budget isn't the primary constraint. Pick Qwen 2.5 72B Instruct if you're running high-volume inference and want solid quality at 40–55% lower cost per token.