
Model crosswalk

Three-way, side-by-side comparison on price, capability, and workload.

Qwen 3 14b Instruct
vs
Qwen 3 32b Instruct
vs
Qwen 3 72b Instruct

Qwen 3 14b Instruct

Cheapest provider
$/1M input
$/1M output

Qwen 3 32b Instruct

Cheapest provider
$/1M input
$/1M output

Qwen 3 72b Instruct

Cheapest provider
$/1M input
$/1M output
Specs and cheapest providers
Spec | Qwen 3 14b Instruct | Qwen 3 32b Instruct | Qwen 3 72b Instruct
Parameters
Context window
License
Released
Cheapest provider
Provider
Input / 1M tokens
Output / 1M tokens
Benchmark comparison

No benchmark data available yet.

Editor's take
Qwen 3 14B, 32B, and 72B Instruct represent the production-deployable mid-to-large tier of Alibaba's Qwen 3 generation, all released in 2025 under the Qwen commercial license with 131K context windows. All three share the multilingual instruction-tuning improvements that define the Qwen 3 line (stronger CJK and Arabic performance than the prior Qwen 2.5 generation), but they separate clearly on benchmark performance and infrastructure requirements.

The 14B is a compact production model whose latency profile aligns with the Mistral Small 3 and Llama 3.1 8B tier, yet it scores ahead of those peers on multilingual evaluations. For products with significant East Asian user traffic and tight latency requirements, it frequently wins against heavier alternatives. Its 131K context handles most retrieval workloads without chunking.

The 32B sits in a cost-effective middle zone, delivering approximately 85% of the 72B's benchmark performance at roughly half the per-token cost at most hosted providers. It handles mixed-language coding tasks, complex summarization over long documents, and multi-step reasoning competently. Teams looking to avoid 72B pricing while retaining most of the model's quality will find the 32B a pragmatic choice.

The 72B is the flagship of the three. It consistently benchmarks at or above Qwen 2.5 72B and competes with Llama 3.3 70B on general English tasks while maintaining the multilingual advantage. For user-facing applications where output quality is visible and the cost of serving 72B parameters is acceptable, this is the Qwen 3 variant to deploy.

Pick the 14B for latency-sensitive workloads with multilingual requirements. Pick the 32B for applications that need strong quality without 72B overhead. Pick the 72B when quality is the primary constraint and infrastructure cost is secondary.
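The claim that a 131K context handles most retrieval workloads without chunking can be sanity-checked for a given document before sending it. The sketch below is a rough estimate only. It assumes "131K" means 131,072 tokens and uses a heuristic of about 4 characters per token, which is a loose approximation for English and can be far off for CJK or Arabic text. The function names and the reserved output budget are hypothetical, and a real check would use the model's own tokenizer.

```python
# Assumption: "131K" is read as 128 * 1024 tokens.
CONTEXT_WINDOW_TOKENS = 131_072
# Assumption: rough English-text heuristic, not a property of the Qwen tokenizer.
CHARS_PER_TOKEN = 4


def estimated_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(
    text: str,
    reserved_output_tokens: int = 2048,
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the estimated prompt plus the reserved output budget fits."""
    return estimated_tokens(text) + reserved_output_tokens <= window


if __name__ == "__main__":
    document = "lorem ipsum " * 20_000  # 240,000 characters of sample text
    print(estimated_tokens(document), fits_in_context(document))
```

If the check fails, the document needs chunking or a retrieval step regardless of which of the three models is chosen, since all three are listed with the same window.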
Frequently asked questions
How does Qwen 3 14b Instruct compare to Qwen 3 32b Instruct and Qwen 3 72b Instruct on price?
Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
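Per-1M-token prices are easier to compare once they are applied to an actual workload. The following is a minimal sketch of that arithmetic. The `Pricing` class, the function names, and every price in `PLACEHOLDER_PRICES` are hypothetical and exist only for illustration. Substitute the input and output figures from the table for your chosen providers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in USD per 1M tokens for one model at one provider."""
    input_per_million: float
    output_per_million: float


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload from its input and output token counts."""
    return (
        input_tokens / 1_000_000 * pricing.input_per_million
        + output_tokens / 1_000_000 * pricing.output_per_million
    )


def rank_by_cost(prices: dict, input_tokens: int, output_tokens: int) -> list:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(pricing, input_tokens, output_tokens)
        for name, pricing in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# PLACEHOLDER values, not real quotes: replace with prices from the table.
PLACEHOLDER_PRICES = {
    "Qwen 3 14b Instruct": Pricing(1.0, 2.0),
    "Qwen 3 32b Instruct": Pricing(2.0, 4.0),
    "Qwen 3 72b Instruct": Pricing(4.0, 8.0),
}

if __name__ == "__main__":
    # Example workload: 50M input tokens and 10M output tokens per month.
    for model, cost in rank_by_cost(PLACEHOLDER_PRICES, 50_000_000, 10_000_000):
        print(f"{model}: ${cost:,.2f}")
```

Input and output tokens are priced separately, so the ranking can change with the input-to-output ratio of the workload. A summarization job that is heavy on input may favor a different provider than a generation-heavy chat product.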
Which model is best for coding: Qwen 3 14b Instruct, Qwen 3 32b Instruct, or Qwen 3 72b Instruct?
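HumanEval and other code benchmark results appear in the Benchmark comparison section above when available; no benchmark data has been published for these models yet. For production code tasks, also consider context window size and provider latency.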
What is the context window for Qwen 3 14b Instruct, Qwen 3 32b Instruct, and Qwen 3 72b Instruct?
Context window sizes are listed in the Context window row of the specs table above. The editor's take gives a 131K context window for all three models.