Model crosswalk

Side-by-side comparison on price, capability and workload. Each column uses the cheapest provider for that model.

Mistral Large 2 vs Qwen 3 72B Instruct

A: Mistral Large 2

123B params · 131K context · mistral-research

Cheapest provider: openrouter
$/1M input: $1800000.00
$/1M output: $5400000.00

B: Qwen 3 72B Instruct

72B params · 131K context · qwen

Cheapest provider: fireworks-ai
$/1M input: $220000.00
$/1M output: $880000.00

Specs and cheapest providers

| Spec | Mistral Large 2 | Qwen 3 72B Instruct |
|---|---|---|
| Parameters | 123B | 72B |
| Context window | 131K tokens | 131K tokens |
| License | mistral-research | qwen |
| Released | 2024-07-24 | 2025-04-28 |
| Cheapest provider | openrouter | fireworks-ai |
| Input / 1M tokens | $1800000.00 | $220000.00 🏆 |
| Output / 1M tokens | $5400000.00 | $880000.00 🏆 |

Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

Using each model's cheapest provider:

Mistral Large 2: $19800000.00 /mo
Qwen 3 72B Instruct: $2860000.00 /mo
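
The monthly figures above follow from multiplying token volume (in millions) by the listed per-1M price for input and output and summing the two. A minimal Python sketch of that arithmetic, using the prices exactly as this page lists them; the `PRICES` table and the `monthly_cost` helper are illustrative names, not part of any provider API:

```python
from decimal import Decimal

# Per-1M-token prices as listed on this page (cheapest provider per model).
PRICES = {
    "Mistral Large 2": {"input": Decimal("1800000.00"), "output": Decimal("5400000.00")},
    "Qwen 3 72B Instruct": {"input": Decimal("220000.00"), "output": Decimal("880000.00")},
}

MILLION = Decimal(1_000_000)


def monthly_cost(model, input_tokens, output_tokens):
    """Cost = (input tokens / 1M) * input price + (output tokens / 1M) * output price."""
    price = PRICES[model]
    input_cost = Decimal(input_tokens) / MILLION * price["input"]
    output_cost = Decimal(output_tokens) / MILLION * price["output"]
    return input_cost + output_cost


# Sample workload: 5M input tokens + 2M output tokens per month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 5_000_000, 2_000_000):.2f} /mo")
```

`Decimal` is used here so that dollar amounts are not subject to binary floating-point rounding; plain floats would also work for a rough estimate.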

What changes at scale

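Which token type dominates cost depends on each model's output-to-input price ratio. Mistral Large 2's output price is 3× its input price, so output spend overtakes input spend once output volume exceeds one third of input volume (a 3:1 input/output token ratio). For Qwen 3 72B Instruct, at 4×, the crossover is a 4:1 input/output ratio. At more input-heavy mixes than that, input dominates and cheaper-input providers win regardless of headline output price.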

| Workload | Mistral Large 2 | Qwen 3 72B Instruct |
|---|---|---|
| 1M in · 250K out | $3150000.00 | $440000.00 |
| 5M in · 2M out | $19800000.00 | $2860000.00 |
| 20M in · 10M out | $90000000.00 | $13200000.00 |
| 100M in · 60M out | $504000000.00 | $74800000.00 |
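
To see where output spend overtakes input spend, one can compute the break-even token mix directly from the two prices. The sketch below assumes the per-1M prices listed on this page; `breakeven_output_per_input` and `output_share` are hypothetical helper names introduced for illustration:

```python
from fractions import Fraction


def breakeven_output_per_input(input_price, output_price):
    """Output tokens per input token at which output spend equals input spend."""
    return Fraction(input_price, output_price)


def output_share(input_tokens, output_tokens, input_price, output_price):
    """Fraction of total spend that goes to output tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return Fraction(output_cost, input_cost + output_cost)


# Per-1M prices as listed on this page (whole dollars).
MISTRAL = (1_800_000, 5_400_000)
QWEN = (220_000, 880_000)

print(breakeven_output_per_input(*MISTRAL))  # 1/3: output dominates above 1 output per 3 input tokens
print(breakeven_output_per_input(*QWEN))     # 1/4: output dominates above 1 output per 4 input tokens

# Output share of spend for two rows of the table.
print(float(output_share(1_000_000, 250_000, *QWEN)))        # 0.5
print(float(output_share(100_000_000, 60_000_000, *MISTRAL)))
```

At the 1M in · 250K out row, Qwen 3 72B Instruct sits exactly at its break-even mix, while every heavier-output row is output-dominated for both models.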

Capability vs price

[Scatter plot: benchmark score vs $/1M output tokens]

Editor's take

[Mistral Large 2](/models/mistralai--mistral-large-2) is a 123B dense model; [Qwen 3 72B Instruct](/models/alibaba--qwen-3-72b-instruct) is Alibaba's third-generation 72B model with a hybrid thinking architecture that can toggle extended chain-of-thought reasoning per request. Pricing on Qwen 3 72B typically lands 35–50% below Mistral Large 2 on output tokens, but that gap narrows significantly when you enable thinking mode, which multiplies token count.

Qwen 3 72B's thinking mode changes the comparison dynamic. For tasks like multi-step math, competitive coding, and complex logical deduction, it can match or exceed Mistral Large 2 quality by spending more tokens internally. In non-thinking mode it's a cost-efficient general model; in thinking mode it trades latency and cost for depth.

**Where [Mistral Large 2](/models/mistralai--mistral-large-2) wins:** Latency-sensitive production workloads where you can't afford 2–5x token overhead from extended reasoning. Dense document processing, real-time chat, and structured extraction tasks where consistent, fast inference matters more than occasional deeper reasoning.

**Where Qwen 3 72B Instruct wins:** Developer environments, agentic loops requiring genuine multi-step problem solving, and code generation tasks where you can selectively enable thinking mode for hard subtasks. Its stronger math and reasoning benchmarks in thinking mode make it the right pick when answer quality outweighs inference cost.

Pick Mistral Large 2 if you need predictable low-latency outputs across varied workloads without per-request reasoning overhead. Pick Qwen 3 72B Instruct if you're building reasoning-heavy agents or want a dual-mode model that scales quality with compute budget.
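
The effect of thinking-mode token overhead on the price gap can be estimated with a simple multiplier on output tokens. The sketch below is a rough model, not a measurement: it assumes reasoning tokens are billed at the output rate, treats the 2–5x overhead mentioned above as a flat multiplier, and uses the per-1M prices listed on this page. `workload_cost` and `breakeven_overhead` are hypothetical helper names.

```python
from fractions import Fraction

MILLION = 1_000_000


def workload_cost(input_tokens, output_tokens, input_price, output_price, output_overhead=1):
    """Monthly cost with output tokens scaled by an assumed thinking-mode overhead."""
    billed_output = output_tokens * Fraction(output_overhead)
    return Fraction(input_tokens, MILLION) * input_price + billed_output / MILLION * output_price


def breakeven_overhead(input_tokens, output_tokens, cheap_prices, baseline_cost):
    """Overhead multiplier at which the cheaper model's cost reaches baseline_cost."""
    input_cost = Fraction(input_tokens, MILLION) * cheap_prices[0]
    output_cost = Fraction(output_tokens, MILLION) * cheap_prices[1]
    return (baseline_cost - input_cost) / output_cost


# Per-1M prices as listed on this page (input, output), whole dollars.
MISTRAL = (1_800_000, 5_400_000)
QWEN = (220_000, 880_000)

# Sample workload: 5M input + 2M output tokens per month.
baseline = workload_cost(5_000_000, 2_000_000, *MISTRAL)
for overhead in (1, 2, 5):
    qwen = workload_cost(5_000_000, 2_000_000, *QWEN, output_overhead=overhead)
    print(f"overhead {overhead}x: Qwen ${float(qwen):,.2f} vs Mistral ${float(baseline):,.2f}")

print(float(breakeven_overhead(5_000_000, 2_000_000, QWEN, baseline)))
```

The break-even multiplier depends heavily on which provider prices are plugged in, so the same calculation with different rates can give a very different answer.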