DeepSeek V3 vs Mistral Large 2 vs Qwen 3 72B Instruct (2026): 3-way comparison

Side-by-side on price, capability, and workload.

A. DeepSeek V3
671B params · 131K context · deepseek license
Cheapest provider: deepinfra
$/1M input: $200000.00
$/1M output: $850000.00

B. Mistral Large 2
123B params · 131K context · mistral-research license
Cheapest provider: openrouter
$/1M input: $1800000.00
$/1M output: $5400000.00

C. Qwen 3 72B Instruct
72B params · 131K context · qwen license
Cheapest provider: fireworks-ai
$/1M input: $220000.00
$/1M output: $880000.00

Specs and cheapest providers

Spec	DeepSeek V3	Mistral Large 2	Qwen 3 72B Instruct
Parameters	671B	123B	72B
Context window	131K tokens	131K tokens	131K tokens
License	deepseek	mistral-research	qwen
Released	2024-12-26	2024-07-24	2025-04-28
Cheapest provider	deepinfra	openrouter	fireworks-ai
Input price / 1M tokens	$200000.00 🏆	$1800000.00	$220000.00
Output price / 1M tokens	$850000.00 🏆	$5400000.00	$880000.00

🏆 marks the lowest price in each row.
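To turn the per-1M-token prices into a comparison for a specific workload, weight the input and output prices by the token counts. The sketch below is a minimal illustration: it copies the figures exactly as listed in the table, and the function names and the 3M-input / 1M-output workload are hypothetical. It reports cost relative to the cheapest model, so the ranking does not depend on the unit the listing uses.

```python
# Listed cheapest-provider prices per 1M tokens, copied from the table as shown.
PRICES = {
    "DeepSeek V3": {"input": 200000.00, "output": 850000.00},
    "Mistral Large 2": {"input": 1800000.00, "output": 5400000.00},
    "Qwen 3 72B Instruct": {"input": 220000.00, "output": 880000.00},
}


def workload_cost(price, input_tokens, output_tokens):
    """Cost of one workload, in the same unit as the listed prices."""
    return (
        (input_tokens / 1_000_000) * price["input"]
        + (output_tokens / 1_000_000) * price["output"]
    )


def relative_costs(input_tokens, output_tokens):
    """Cost of each model divided by the cheapest model's cost."""
    costs = {
        model: workload_cost(price, input_tokens, output_tokens)
        for model, price in PRICES.items()
    }
    cheapest = min(costs.values())
    return {model: cost / cheapest for model, cost in costs.items()}


if __name__ == "__main__":
    # Hypothetical workload: 3M input tokens and 1M output tokens.
    for model, ratio in relative_costs(3_000_000, 1_000_000).items():
        print(f"{model}: {ratio:.2f}x the cheapest option")
```

Changing the input-to-output mix shifts the ratios, because the three models differ more on input price than on output price.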

Benchmark comparison

No benchmark data available yet.

Editor's take

Three serious competitors released between July 2024 and April 2025, all with 131K context windows and strong benchmark profiles, but with meaningfully different cost trajectories heading into 2026.

DeepSeek V3 is the 671B-parameter mixture-of-experts model from December 2024, routing each token through 8 of 256 routed experts for roughly 37B active parameters per forward pass. At launch, it was among the most capable open models on code and math benchmarks relative to its effective inference cost. The key context in 2026: DeepSeek V3.2 shipped later in 2025 with roughly 30% lower inference pricing. V3 remains hosted on DeepInfra, Fireworks, and OpenRouter but is now the legacy variant; if you are starting fresh, V3.2 is the current-generation choice. Verify DeepSeek's license terms before commercial use.

Mistral Large 2 is Mistral AI's 123B flagship from July 2024, positioned as a strong general-purpose model with competitive MMLU and coding scores. It performs well on French and other European-language benchmarks relative to peers, reflecting Mistral's European origin. It is hosted through Mistral's own API and selected providers. The weights are released under the Mistral Research License, with commercial deployment available through Mistral's API.

Qwen 3 72B Instruct is Alibaba's April 2025 model and the newest of the three, with strong multilingual coverage that spans CJK and Arabic alongside competitive MMLU and HumanEval scores. At 72B parameters it has a smaller total footprint to host than either V3 (671B total) or Mistral Large 2 (123B), although the listed cheapest-provider prices put V3 slightly lower. Provider coverage on mainstream platforms is wide.

Pick DeepSeek V3.2 (over V3) when MoE inference efficiency and top coding benchmarks are the priority. Pick Mistral Large 2 when European-language quality and Mistral's API ecosystem are relevant. Pick Qwen 3 72B for multilingual breadth and the best cost-to-capability ratio at the 72B tier.
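The "8 of 256 experts, roughly 37B active parameters" description of DeepSeek V3 can be made concrete with a small sketch. The code below shows a generic top-k softmax gate, which is an assumption chosen for illustration and not DeepSeek's actual router. The function names are hypothetical, and the router scores are random placeholders.

```python
import math
import random

TOTAL_PARAMS_B = 671   # total parameters, from the comparison
ACTIVE_PARAMS_B = 37   # approximate active parameters per token
NUM_EXPERTS = 256
TOP_K = 8


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Generic top-k gating: softmax over the selected logits only,
    so the k weights sum to 1. Illustrative, not DeepSeek's exact scheme.
    """
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_fraction():
    """Share of the model's parameters used for a single token."""
    return ACTIVE_PARAMS_B / TOTAL_PARAMS_B


if __name__ == "__main__":
    random.seed(0)
    logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # placeholder scores
    for expert, weight in top_k_route(logits, TOP_K):
        print(f"expert {expert}: weight {weight:.3f}")
    print(f"active share of parameters: {active_fraction():.1%}")
```

Only about one-eighteenth of the weights take part in each token's forward pass, which is why a 671B-parameter model can be priced close to a 72B dense model.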

Frequently asked questions

How does DeepSeek V3 compare to Mistral Large 2 and Qwen 3 72B Instruct on price?
By the listed cheapest-provider prices, DeepSeek V3 (deepinfra) is lowest on both input and output. Qwen 3 72B Instruct (fireworks-ai) is about 10% higher on input and about 4% higher on output. Mistral Large 2 (openrouter) costs 9x as much on input and roughly 6x as much on output. See the table above for the per-1M-token figures.

Which model is best for coding: DeepSeek V3, Mistral Large 2, or Qwen 3 72B Instruct?
The comparison has no benchmark data yet, so HumanEval and other code scores cannot be compared here. For production code tasks, also consider context window size and provider latency.

What is the context window for DeepSeek V3, Mistral Large 2, and Qwen 3 72B Instruct?
All three models list a 131K-token context window; see the Context window row of the specs table above.
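Since all three models list the same 131K context window, a quick budget check applies to each of them. The sketch below makes several assumptions: "131K" is read literally as 131,000 tokens (a provider's exact limit may differ), the prompt and the generated output share the same window, and the four-characters-per-token estimate is a rough heuristic and not any model's real tokenizer. All function names are hypothetical.

```python
import math

CONTEXT_WINDOW_TOKENS = 131_000  # "131K" read literally; an assumption


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; real counts depend on each model's tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(prompt_tokens, max_output_tokens, window=CONTEXT_WINDOW_TOKENS):
    """True if the prompt plus the reserved output budget fits in the window."""
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    prompt = "example document text " * 20_000  # placeholder input
    needed = estimate_tokens(prompt)
    print(f"estimated prompt tokens: {needed}")
    print(f"fits with 8,000 output tokens reserved: {fits_context(needed, 8_000)}")
```

For a real deployment, count tokens with the tokenizer of the chosen model and confirm the limit enforced by the provider.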
