Mistral Large 2 vs Qwen 2.5 72B Instruct (2026) — pricing, benchmarks, cheapest providers

Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

Mistral Large 2

123B params · 131K context · mistral-research

Cheapest provider: openrouter

$/1M input: $1800000.00

$/1M output: $5400000.00

Qwen 2.5 72B Instruct

72B params · 131K context · qwen

Cheapest provider: deepinfra

$/1M input: $180000.00

$/1M output: $350000.00

Specs and cheapest providers

Spec	Mistral Large 2	Qwen 2.5 72B Instruct
Parameters	123B	72B
Context window	131K tokens	131K tokens
License	mistral-research	qwen
Released	2024-07-24	2024-09-19
Cheapest provider	openrouter	deepinfra
Input / 1M tokens	$1800000.00	$180000.00 (cheaper)
Output / 1M tokens	$5400000.00	$350000.00 (cheaper)

Qwen 2.5 72B Instruct rankings: #6 cheapest input · #7 cheapest output · #9 fastest TTFT · #9 highest throughput · #4 best MMLU · #4 best HumanEval

Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

using each model's cheapest provider

Mistral Large 2

$19800000.00 /mo

Qwen 2.5 72B Instruct

$1600000.00 /mo
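
The monthly figures above are token volume times the per-1M rate, summed over input and output. A minimal sketch of that arithmetic, using the rates exactly as listed on this page; the function name `monthly_cost` and the `RATES` layout are illustrative and not part of any provider API:

```python
# Rates as listed on this page, in dollars per 1M tokens
# (cheapest provider for each model). Layout is an assumption for illustration.
RATES = {
    "Mistral Large 2": {"input": 1_800_000.00, "output": 5_400_000.00},
    "Qwen 2.5 72B Instruct": {"input": 180_000.00, "output": 350_000.00},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return (tokens / 1M) * rate, summed over input and output."""
    rate = RATES[model]
    input_cost = (input_tokens / 1_000_000) * rate["input"]
    output_cost = (output_tokens / 1_000_000) * rate["output"]
    return input_cost + output_cost


# Sample workload: 5M input tokens and 2M output tokens per month.
for name in RATES:
    print(f"{name}: ${monthly_cost(name, 5_000_000, 2_000_000):.2f} /mo")
```

Swapping in your own token counts, or a different provider's rates, only requires changing the arguments or the `RATES` entries.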

What changes at scale

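Which side of the bill dominates depends on each model's output-to-input price ratio. At these rates Mistral Large 2 charges 3× as much for output as for input, so output becomes the larger cost once output volume exceeds one third of input volume. For Qwen 2.5 72B Instruct the ratio is about 1.9×, so the crossover sits at roughly half the input volume. Below those thresholds input dominates, and the provider with the cheaper input rate wins regardless of headline output price.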

Workload	Mistral Large 2	Qwen 2.5 72B Instruct
1M in · 250K out	$3150000.00	$267500.00
5M in · 2M out	$19800000.00	$1600000.00
20M in · 10M out	$90000000.00	$7100000.00
100M in · 60M out	$504000000.00	$39000000.00
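
To see where output starts to dominate, note that output cost exceeds input cost once output_tokens / input_tokens is greater than input_rate / output_rate. The sketch below computes that threshold and the output share of the bill for the four workloads listed here. It assumes the rates shown on this page; the helper names `break_even_ratio` and `output_share` are hypothetical.

```python
# (input rate, output rate) in dollars per 1M tokens, as listed on this page.
RATES = {
    "Mistral Large 2": (1_800_000.00, 5_400_000.00),
    "Qwen 2.5 72B Instruct": (180_000.00, 350_000.00),
}

# (input tokens, output tokens) for the four workloads listed on this page.
WORKLOADS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]


def break_even_ratio(input_rate: float, output_rate: float) -> float:
    """Output/input token ratio at which output cost equals input cost."""
    return input_rate / output_rate


def output_share(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Fraction of the total bill that comes from output tokens."""
    input_cost = input_tokens * input_rate
    output_cost = output_tokens * output_rate
    return output_cost / (input_cost + output_cost)


for name, (in_rate, out_rate) in RATES.items():
    print(f"{name}: output dominates above "
          f"{break_even_ratio(in_rate, out_rate):.2f} output tokens per input token")
    for in_tok, out_tok in WORKLOADS:
        share = output_share(in_tok, out_tok, in_rate, out_rate)
        print(f"  {in_tok:>11,} in / {out_tok:>10,} out -> output is {share:.0%} of cost")
```

With these rates the threshold works out to about 0.33 for Mistral Large 2 and about 0.51 for Qwen 2.5 72B Instruct, so the 5M in · 2M out workload is output-heavy for the former and still input-heavy for the latter.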

Capability vs price

[Scatter plot: benchmark score vs. $/1M output tokens]

Editor's take

The core tradeoff here is parameter count versus price. [Mistral Large 2](/models/mistralai--mistral-large-2) runs 123B parameters; [Qwen 2.5 72B Instruct](/models/alibaba--qwen-2.5-72b-instruct) comes in at 72B. Across commodity providers, Qwen 2.5 72B consistently prices 40–55% cheaper per million tokens on both input and output, simply because it fits on fewer GPUs and competes in a more crowded supply market. At the cheapest providers listed above the gap is wider still: about 90% on input and about 94% on output.

Benchmark scores narrow the capability gap that the parameter counts suggest. On MMLU and coding evals, Qwen 2.5 72B punches well above its size, and Alibaba's post-training work on math and code reasoning shows in practice. For most general-purpose tasks, quality differences are marginal unless you're doing highly nuanced multi-step reasoning or complex function-call chaining.

**Where Mistral Large 2 wins:** Tasks demanding deep reasoning over long contexts (64K+ tokens), multi-document synthesis, and agentic pipelines with dense tool-use schemas. Its larger capacity handles context degradation more gracefully at high fill ratios.

**Where Qwen 2.5 72B Instruct wins:** Cost-constrained API products, coding assistants, math-heavy workflows, and any workload where you can validate quality cheaply and scale horizontally. The model's strong multilingual coverage (29+ languages) also makes it preferable for non-English inference at scale.

Pick Mistral Large 2 if you need headroom for complex reasoning over very long inputs and budget isn't the primary constraint. Pick Qwen 2.5 72B Instruct if you're running high-volume inference and want solid quality at 40–55% lower cost per token.
