DeepSeek V3.2 vs Phi-3 Medium 128K vs Qwen 3 72B Instruct (2026): 3-way comparison

Model crosswalk

A three-way, side-by-side comparison on price, capability, and workload.

A. DeepSeek V3.2

671B params · 131K context · deepseek license

Cheapest provider: together-ai

$/1M input: $0.27

$/1M output: $1.10

B. Phi-3 Medium 128K

14B params · 131K context · mit license

Cheapest provider: not listed

$/1M input: not listed

$/1M output: not listed

C. Qwen 3 72B Instruct

72B params · 131K context · qwen license

Cheapest provider: fireworks-ai

$/1M input: $0.22

$/1M output: $0.88

Specs and cheapest providers

Spec	DeepSeek V3.2	Phi-3 Medium 128K	Qwen 3 72B Instruct
Parameters	671B	14B	72B
Context window	131K tokens	131K tokens	131K tokens
License	deepseek	mit	qwen
Released	2025-05-07	2024-05-21	2025-04-28
Cheapest provider
Provider	together-ai	—	fireworks-ai
Input / 1M tokens	$0.27	—	$0.22 🏆
Output / 1M tokens	$1.10	—	$0.88 🏆
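
Per-1M-token prices only become comparable once they are applied to a concrete workload, since input and output tokens are billed at different rates. The following is a minimal sketch of that arithmetic. The function name `workload_cost` and the prices and token counts in the example call are placeholders chosen for illustration, not figures from this page; substitute the input and output prices of the provider you are evaluating.

```python
from decimal import Decimal

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def workload_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Return the dollar cost of a workload given per-1M-token prices.

    Prices are passed as strings (or Decimals) to avoid float rounding.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_PRICING_UNIT * Decimal(input_price_per_1m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_PRICING_UNIT * Decimal(output_price_per_1m)
    return input_cost + output_cost


# Placeholder workload and prices (assumptions, not taken from the table):
# 2M input tokens at $0.50 per 1M and 0.5M output tokens at $2.00 per 1M.
example_cost = workload_cost(2_000_000, 500_000, "0.50", "2.00")
print(f"Estimated cost: ${example_cost:.2f}")
```

Running the same function once per model, with each model's cheapest-provider prices, gives a like-for-like comparison for your own input/output mix. Output-heavy workloads shift the ranking toward whichever model has the lower output price.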

Benchmark comparison

No benchmark data available yet.

Editor's take

This comparison spans three genuinely different architectures, making it one of the more practically useful comparisons in the sub-frontier tier.

DeepSeek V3.2 is a mixture-of-experts (MoE) model from DeepSeek, the May 2025 successor to V3, with a reported ~30% reduction in inference pricing over its predecessor. It activates only a subset of its 671B total parameters for each token, which makes its inference cost much lower than that of a comparably capable dense model. On coding, math, and general reasoning benchmarks it performs well above what its inference cost implies, which is why it drew significant attention at launch. The 131K context window is available through DeepInfra, Fireworks, and OpenRouter. Verify DeepSeek's commercial license terms before deploying.

Phi-3 Medium 128K, at 14 billion dense parameters, trades breadth for efficiency. Microsoft's synthetic training data drives MMLU and GSM8K scores above most 14B peers, but the model does not cover the full range of benchmarks where DeepSeek V3.2 and Qwen 3 72B compete. Its primary advantage is cost: at 14B parameters with MIT licensing, it is the cheapest of the three to run for structured reasoning tasks, although the table above lists no hosted-provider pricing for it. Provider availability skews toward Azure AI.

Qwen 3 72B Instruct from Alibaba covers the broadest benchmark surface in this group (strong multilingual, code, and reasoning coverage) at 72B parameters, with a 131K context window and the Qwen commercial license. It is the most predictable all-rounder of the three.

Pick DeepSeek V3.2 when raw benchmark performance per inference dollar is the primary metric and you can tolerate MoE operational complexity. Pick Qwen 3 72B Instruct for reliable breadth across multilingual and coding workloads. Pick Phi-3 Medium 128K when the task is narrow, structured, and cost-sensitive.
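
To make the mixture-of-experts point in the editor's take concrete, here is a toy sketch of top-k expert routing in plain Python. It is a generic illustration of the technique, not DeepSeek's actual router: the expert functions, gate scores, and the choice of `k=2` are all assumptions made up for the example.

```python
import math


def top_k_route(gate_logits, k):
    """Select the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(gate_logits[i] for i in chosen)
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: exps[i] / total for i in chosen}


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by routing weight."""
    weights = top_k_route(gate_logits, k)
    return sum(weight * experts[i](x) for i, weight in weights.items())


# Four hypothetical "experts", each just scaling its input by a constant.
experts = [lambda x, scale=scale: scale * x for scale in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # made-up gate scores for one token

routing = top_k_route(gate_logits, k=2)
output = moe_forward(1.0, experts, gate_logits, k=2)
print(routing, output)
```

Only two of the four experts run for this token, which is the source of the cost advantage: compute per token scales with the selected experts, while total capacity scales with all of them.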

Compare two at a time

DeepSeek V3.2 vs Phi-3 Medium 128K
DeepSeek V3.2 vs Qwen 3 72B Instruct
Phi-3 Medium 128K vs Qwen 3 72B Instruct

Frequently asked questions

How does DeepSeek V3.2 compare to Phi-3 Medium 128K and Qwen 3 72B Instruct on price?
Among the cheapest listed providers, Qwen 3 72B Instruct (fireworks-ai) undercuts DeepSeek V3.2 (together-ai) on both input and output price per 1M tokens. No provider pricing is listed for Phi-3 Medium 128K. See the specs table above for the figures.

Which model is best for coding: DeepSeek V3.2, Phi-3 Medium 128K, or Qwen 3 72B Instruct?
No benchmark data is available for these models yet, so this page cannot rank them on HumanEval or other code benchmarks. For production code tasks, weigh context window size and provider latency as well.

What is the context window for DeepSeek V3.2, Phi-3 Medium 128K, and Qwen 3 72B Instruct?
All three models list a 131K-token context window; see the Context window row of the specs table above.

Full model details

All providers for DeepSeek V3.2 →
All providers for Phi-3 Medium 128K →
All providers for Qwen 3 72B Instruct →