Llama 3.1 405B Instruct vs Mistral Large 2 vs Qwen 3 72B Instruct (2026): 3-way comparison

Model crosswalk

A side-by-side, three-way comparison on price, capability, and workload.

Llama 3.1 405B Instruct

405B params · 131K context · llama-3

Cheapest provider: deepinfra

$/1M input: $2.70

$/1M output: $8.00

Mistral Large 2

123B params · 131K context · mistral-research

Cheapest provider: openrouter

$/1M input: $1.80

$/1M output: $5.40

Qwen 3 72B Instruct

72B params · 131K context · qwen

Cheapest provider: fireworks-ai

$/1M input: $0.22

$/1M output: $0.88

Specs and cheapest providers

Spec	Llama 3.1 405B Instruct	Mistral Large 2	Qwen 3 72B Instruct
Parameters	405B	123B	72B
Context window	131K tokens	131K tokens	131K tokens
License	llama-3	mistral-research	qwen
Released	2024-07-23	2024-07-24	2025-04-28
Cheapest provider	deepinfra	openrouter	fireworks-ai
Input / 1M tokens	$2.70	$1.80	$0.22 🏆
Output / 1M tokens	$8.00	$5.40	$0.88 🏆
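
Per-1M-token prices turn into a workload cost by simple proportion: tokens divided by one million, times the listed rate, summed over input and output. The sketch below is illustrative and not part of the original page. The names `workload_cost`, `rank_by_cost`, and `PRICES` are hypothetical, and the rates assume the table's figures are read as $2.70 / $8.00, $1.80 / $5.40, and $0.22 / $0.88 per 1M tokens. The token volumes in the usage lines are made up.

```python
from typing import Dict, List, Tuple

# Assumed rates in USD per 1M tokens as (input, output), taken from the
# cheapest-provider rows above. The scale is an assumption, not a quote.
PRICES: Dict[str, Tuple[float, float]] = {
    "Llama 3.1 405B Instruct": (2.70, 8.00),
    "Mistral Large 2": (1.80, 5.40),
    "Qwen 3 72B Instruct": (0.22, 0.88),
}


def workload_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Return the USD cost of a workload priced per 1M tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_1m
    output_cost = (output_tokens / 1_000_000) * output_price_per_1m
    return input_cost + output_cost


def rank_by_cost(
    input_tokens: int,
    output_tokens: int,
    prices: Dict[str, Tuple[float, float]] = PRICES,
) -> List[Tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in prices.items()
    }
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical monthly volume: 50M input tokens, 10M output tokens.
    for model, cost in rank_by_cost(50_000_000, 10_000_000):
        print(f"{model}: ${cost:,.2f}")
```

Because output tokens are priced at roughly three to four times the input rate for all three models, the input-to-output ratio of a workload can matter as much as the choice of model.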

Benchmark comparison

No benchmark data available yet.

Editor's take

Llama 3.1 405B Instruct, Mistral Large 2, and Qwen 3 72B Instruct span a range of parameter counts and publisher strategies, but all three target high-quality open-weights inference for production use. Meta's 405B model (July 2024, Llama 3 community license) sits at the frontier of what is accessible in open weights. Mistral Large 2 (123B, July 2024, Mistral Research license) and Qwen 3 72B (Alibaba, April 2025, Qwen license) compete in the tier below it, with different multilingual and licensing profiles.

Qwen 3 72B is the most recent of the three, bringing 2025-generation instruction tuning and strong multilingual performance across CJK and Arabic alongside a 131K context window. For teams serving non-English users, that multilingual strength is a practical advantage over both alternatives. The Qwen license permits commercial deployment, subject to its terms.

Mistral Large 2 delivers competitive multilingual performance across European languages and solid coding evals at 123B parameters. The Mistral Research license limits fully commercial use without an enterprise agreement, which is a real constraint for teams that need unrestricted deployment or fine-tuning rights. For organizations with an existing Mistral relationship, its quality-per-token profile remains competitive with Qwen 3 72B.

Llama 3.1 405B is in a different operating tier. At 405 billion parameters it handles tasks that expose the limits of the smaller models: complex multi-step reasoning, long-form synthesis, and advanced coding workflows. Multi-GPU infrastructure requirements and thinner provider availability make it a targeted choice rather than a default for volume inference. At the cheapest listed providers, its per-token price is about 1.5x that of Mistral Large 2 and roughly 9x to 12x that of Qwen 3 72B.

Pick Qwen 3 72B for general-purpose or multilingual production deployments. Pick Mistral Large 2 within an existing Mistral commercial agreement. Pick Llama 3.1 405B only when task complexity demonstrably exceeds what the smaller models can handle and the infrastructure cost is justified.

Compare two at a time

Llama 3.1 405B Instruct vs Mistral Large 2
Llama 3.1 405B Instruct vs Qwen 3 72B Instruct
Mistral Large 2 vs Qwen 3 72B Instruct

Frequently asked questions

How does Llama 3.1 405B Instruct compare to Mistral Large 2 and Qwen 3 72B Instruct on price?
At the cheapest listed providers, Qwen 3 72B Instruct (fireworks-ai) is the least expensive on both input and output, followed by Mistral Large 2 (openrouter) and then Llama 3.1 405B Instruct (deepinfra). The Specs table above lists the input and output prices per 1M tokens.

Which model is best for coding: Llama 3.1 405B Instruct, Mistral Large 2, or Qwen 3 72B Instruct?
No benchmark data is available for these models yet, so this page cannot rank them on HumanEval or other code benchmarks. For production code tasks, consider context window size and provider latency.

What is the context window for Llama 3.1 405B Instruct, Mistral Large 2, and Qwen 3 72B Instruct?
All three models have a 131K-token context window, as listed in the Context window row of the Specs table above.

Full model details

All providers for Llama 3.1 405B Instruct →
All providers for Mistral Large 2 →
All providers for Qwen 3 72B Instruct →