Mistral Large 2 vs Nemotron-4 340B Instruct vs Qwen 3 72B Instruct (2026): 3-way comparison

Model crosswalk

Side-by-side on price, capability, and workload.

Mistral Large 2

123B params · 131K context · mistral-research

Cheapest provider: openrouter

$/1M input: $1.80

$/1M output: $5.40

Nemotron-4 340B Instruct

340B params · 4K context · nvidia-open-model

Cheapest provider: none listed

$/1M input: n/a

$/1M output: n/a

Qwen 3 72B Instruct

72B params · 131K context · qwen

Cheapest provider: fireworks-ai

$/1M input: $0.22

$/1M output: $0.88

Specs and cheapest providers

Spec	Mistral Large 2	Nemotron-4 340B Instruct	Qwen 3 72B Instruct
Parameters	123B	340B	72B
Context window	131K tokens	4K tokens	131K tokens
License	mistral-research	nvidia-open-model	qwen
Released	2024-07-24	2024-06-14	2025-04-28
Cheapest provider	openrouter	n/a	fireworks-ai
Input price (USD / 1M tokens)	$1.80	n/a	$0.22 (lowest)
Output price (USD / 1M tokens)	$5.40	n/a	$0.88 (lowest)
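
Per-1M-token prices are easier to compare once they are applied to a concrete workload. The sketch below shows the arithmetic only. The `Rate` class, the `workload_cost` function, and the example rate are hypothetical and are not taken from any provider's API or from the table; substitute the input and output prices listed for the model you are evaluating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Per-1M-token prices in USD, supplied by the caller."""
    input_per_million: float
    output_per_million: float


def workload_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the given per-1M-token rate."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / 1_000_000


# Hypothetical rate for illustration only: $2.00 in, $6.00 out per 1M tokens.
example_rate = Rate(input_per_million=2.00, output_per_million=6.00)
cost = workload_cost(example_rate, input_tokens=3_000_000, output_tokens=500_000)
print(f"${cost:.2f}")  # 3 * 2.00 + 0.5 * 6.00 = $9.00
```

Because output tokens are usually priced higher than input tokens, the ranking between two models can depend on how output-heavy the workload is, so it is worth running the calculation with a realistic input/output split.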

Benchmark comparison

No benchmark data available yet.

Editor's take

Three publisher flagships with almost nothing in common beyond their billing tier.

Mistral Large 2 is a 123B-parameter dense model released in July 2024 by France-based Mistral AI, built around European multilingual quality, reliable structured output, and tight integration with Mistral's managed API. Function calling and JSON mode are first-class. Few rivals at this price point match its 131K-token context window. The Mistral Research License restricts self-hosted commercial use, so production workloads generally go through Mistral's own API or require an enterprise agreement.

Nemotron-4 340B Instruct is NVIDIA's flagship open release, a dense 340-billion-parameter model released in June 2024. Its design purpose is synthetic training data generation rather than general-purpose chat: NVIDIA explicitly positioned it as a reference model for producing diverse, high-quality instruction datasets for fine-tuning smaller models. The 4K context ceiling is a hard constraint that rules it out for most document-processing and RAG use cases. Hosting concentrates on NVIDIA's NIM service, and the NVIDIA Open Model License is not OSI-approved. If you are not building synthetic data pipelines, the cost-to-benefit is difficult to justify.

Qwen 3 72B Instruct is Alibaba's April 2025 flagship open model at 72 billion parameters, inheriting a 131K-token context window and adding meaningfully improved multilingual coverage across CJK, Arabic, and European languages over its 2.5-series predecessor. In the editor's assessment it is competitive with Mistral Large 2 on most English-language tasks and stronger on multilingual evals, although this page has no benchmark data yet to confirm that.

Pick Mistral Large 2 for European-language enterprise workloads with managed API support. Pick Nemotron-4 340B Instruct only for synthetic data generation at scale. Pick Qwen 3 72B Instruct as the cost-effective option when multilingual breadth and long-context support are both required.
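
The context-window constraint discussed above can be checked before choosing a model. The following is a minimal sketch, not a definitive implementation: the window sizes come from the specs table, with "131K" and "4K" assumed to mean 131,072 and 4,096 tokens; the 4-characters-per-token estimate and the 512-token reply budget are assumed heuristics, and `estimate_tokens` and `models_that_fit` are hypothetical helper names. A real tokenizer will give different counts.

```python
# Context windows from the specs table; "131K" and "4K" are assumed to mean
# 131,072 and 4,096 tokens respectively.
CONTEXT_WINDOWS = {
    "Mistral Large 2": 131_072,
    "Nemotron-4 340B Instruct": 4_096,
    "Qwen 3 72B Instruct": 131_072,
}


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; 4 characters per token is an assumed heuristic."""
    return max(1, round(len(text) / chars_per_token))


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 512) -> list[str]:
    """List models whose window holds the prompt plus a reserved reply budget."""
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]


document = "word " * 8_000  # 40,000 characters, about 10,000 estimated tokens
print(models_that_fit(estimate_tokens(document)))
# ['Mistral Large 2', 'Qwen 3 72B Instruct']
```

A document of roughly 10,000 tokens already exceeds the 4K window, which is why long-document and RAG workloads narrow the choice to the two 131K models.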

Compare two at a time

Mistral Large 2 vs Nemotron-4 340B Instruct
Mistral Large 2 vs Qwen 3 72B Instruct
Nemotron-4 340B Instruct vs Qwen 3 72B Instruct

Frequently asked questions

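How does Mistral Large 2 compare to Nemotron-4 340B Instruct and Qwen 3 72B Instruct on price?

Compare the input and output prices per 1M tokens in the specs table above, which lists the cheapest available provider for each model. Qwen 3 72B Instruct (via fireworks-ai) is the cheapest listed option for both input and output, Mistral Large 2 is priced via openrouter, and Nemotron-4 340B Instruct has no provider listed.

Which model is best for coding: Mistral Large 2, Nemotron-4 340B Instruct, or Qwen 3 72B Instruct?

No benchmark data is available for these models yet (see Benchmark comparison above), so this page cannot rank them on HumanEval or other code benchmarks. For production code tasks, consider context window size and provider latency; note that Nemotron-4 340B Instruct is limited to a 4K-token context window.

What is the context window for Mistral Large 2, Nemotron-4 340B Instruct, and Qwen 3 72B Instruct?

Mistral Large 2 and Qwen 3 72B Instruct each have a 131K-token context window; Nemotron-4 340B Instruct has a 4K-token window. These values are listed in the Context window row of the specs table above.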

Full model details

All providers for Mistral Large 2
All providers for Nemotron-4 340B Instruct
All providers for Qwen 3 72B Instruct