Model crosswalk

Side-by-side on price, capability and workload — three-way comparison.

DeepSeek V3 vs Mistral Large 2 vs Qwen 3 72B Instruct
A. DeepSeek V3

671B params · 131K context · deepseek

Cheapest provider: deepinfra
$/1M input: $200000.00
$/1M output: $850000.00
B. Mistral Large 2

123B params · 131K context · mistral-research

Cheapest provider: openrouter
$/1M input: $1800000.00
$/1M output: $5400000.00
C. Qwen 3 72B Instruct

72B params · 131K context · qwen

Cheapest provider: fireworks-ai
$/1M input: $220000.00
$/1M output: $880000.00
Specs and cheapest providers

| Spec | DeepSeek V3 | Mistral Large 2 | Qwen 3 72B Instruct |
| --- | --- | --- | --- |
| Parameters | 671B | 123B | 72B |
| Context window | 131K tokens | 131K tokens | 131K tokens |
| License | deepseek | mistral-research | qwen |
| Released | 2024-12-26 | 2024-07-24 | 2025-04-28 |
| Cheapest provider | deepinfra | openrouter | fireworks-ai |
| Input / 1M tokens | $200000.00 🏆 | $1800000.00 | $220000.00 |
| Output / 1M tokens | $850000.00 🏆 | $5400000.00 | $880000.00 |
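To turn the per-1M-token prices into a workload estimate, multiply each price by the corresponding token count and divide by one million. The sketch below is a minimal illustration, not part of the page's tooling: the price figures are copied verbatim from the listing above, while the `Price` class, the function names, and the token counts in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    provider: str
    input_per_m: float   # USD per 1M input tokens, as listed above
    output_per_m: float  # USD per 1M output tokens, as listed above


# Figures copied verbatim from the cheapest-provider listing above.
PRICES = {
    "DeepSeek V3": Price("deepinfra", 200000.00, 850000.00),
    "Mistral Large 2": Price("openrouter", 1800000.00, 5400000.00),
    "Qwen 3 72B Instruct": Price("fireworks-ai", 220000.00, 880000.00),
}


def workload_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload: tokens times the per-1M price, scaled back by 1M."""
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(price, input_tokens, output_tokens)
        for name, price in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])
```

For example, `rank_by_cost(3_000_000, 1_000_000)` ranks the three models for a hypothetical input-heavy workload. Because output tokens are priced several times higher than input tokens for every model here, the input/output mix matters as much as the headline input price.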
Benchmark comparison

No benchmark data available yet.

Editor's take
Three serious competitors released between July 2024 and April 2025, all with 131K context windows, but with meaningfully different cost trajectories heading into 2026.

DeepSeek V3 is the 671B-parameter mixture-of-experts model from December 2024. It routes each token through 8 of 256 experts, for roughly 37B active parameters per forward pass. At launch, it was among the most capable open models on code and math benchmarks relative to its effective inference cost. The key context in 2026: DeepSeek V3.2 shipped in May 2025 with roughly 30% lower inference pricing. V3 remains hosted on DeepInfra, Fireworks, and OpenRouter but is now the legacy variant; if you are starting fresh, V3.2 is the current-generation choice. Verify DeepSeek's license terms before commercial use.

Mistral Large 2 is Mistral AI's 123B flagship from July 2024, positioned as a strong general-purpose model with competitive MMLU and coding scores. It performs well on French and other European-language benchmarks relative to peers, reflecting Mistral's European origin. It is hosted through Mistral's own API and selected providers. The weights are released under the Mistral Research License, with commercial deployment available through Mistral's API.

Qwen 3 72B Instruct is Alibaba's April 2025 model and the newest of the three, with strong multilingual coverage that spans CJK and Arabic alongside competitive MMLU and HumanEval scores. At 72B parameters it is the smallest of the three by total parameter count (V3 is 671B, Mistral Large 2 is 123B), which generally makes it cheaper to self-host, and provider coverage on mainstream platforms is wide.

Pick DeepSeek V3.2 (over V3) when MoE inference efficiency and top coding benchmarks are the priority. Pick Mistral Large 2 when European-language quality and Mistral's API ecosystem are relevant. Pick Qwen 3 72B for multilingual breadth and the best cost-to-capability ratio at the 72B tier.
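The mixture-of-experts figures quoted for DeepSeek V3 can be put in proportion with a little arithmetic. The sketch below uses only numbers stated on this page (671B total, roughly 37B active, 8 of 256 experts, and the 123B and 72B sizes of the other two models). The function name is illustrative, and treating Mistral Large 2 and Qwen 3 72B Instruct as dense models whose active count equals their total is an assumption, not something the page states.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of a model's parameters used for each token."""
    return active_params_b / total_params_b


# DeepSeek V3: roughly 37B of 671B parameters are active per token.
DEEPSEEK_ACTIVE_SHARE = active_fraction(37, 671)   # about 0.055

# 8 of 256 experts are selected per token.
ROUTED_EXPERT_SHARE = 8 / 256                      # 0.03125

# Active parameters per token in billions. The two non-MoE entries assume
# dense models, so active == total (an assumption for illustration).
ACTIVE_PARAMS_B = {
    "DeepSeek V3": 37,
    "Mistral Large 2": 123,
    "Qwen 3 72B Instruct": 72,
}

fewest_active = min(ACTIVE_PARAMS_B, key=ACTIVE_PARAMS_B.get)
```

The active share (about 5.5%) is larger than the routed-expert share (about 3.1%) because the parameters outside the experts, such as attention layers and embeddings, run for every token. Under the dense assumption, DeepSeek V3 has the fewest active parameters per token of the three despite having the most total parameters, which is the sense in which its inference cost is low relative to its size.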
Frequently asked questions
How does DeepSeek V3 compare to Mistral Large 2 and Qwen 3 72B Instruct on price?
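At the cheapest listed provider for each model, DeepSeek V3 (deepinfra) has the lowest input and output prices, Qwen 3 72B Instruct (fireworks-ai) is close behind, and Mistral Large 2 (openrouter) is the most expensive. The table above gives the input and output prices per 1M tokens.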
Which model is best for coding: DeepSeek V3, Mistral Large 2, or Qwen 3 72B Instruct?
This page has no benchmark data yet, so HumanEval and other code benchmarks cannot be compared here. For production code tasks, also consider context window size and provider latency.
What is the context window for DeepSeek V3, Mistral Large 2, and Qwen 3 72B Instruct?
All three models list a 131K-token context window; see the Context window row of the specs table above.