Head to head · May 27, 2026

DeepSeek R1 vs Qwen 3 72B Instruct

Side-by-side on verified pricing, benchmarks, and provider availability.

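| Dimension | DeepSeek R1 | Qwen 3 72B Instruct |
| --- | --- | --- |
| Cheapest $/1M out | $2.00 | $0.45 |
| Cheapest $/1M in | $0.40 | $0.23 |
| Cheapest provider | DeepInfra | DeepInfra |

Capabilities

| Dimension | DeepSeek R1 | Qwen 3 72B Instruct |
| --- | --- | --- |
| Context window | 131K | 131K |
| Parameters | 671B | 72B |
| License | MIT | Qwen |
| Released | 2025-01-20 | 2025-04-28 |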

Verdict

Scale versus specialization. [Qwen 3 72B Instruct](/models/alibaba--qwen-3-72b-instruct) is Alibaba's latest 72B general-purpose model: competitive on coding and multilingual benchmarks, with a 131K context window and pricing typically in the $0.20–$0.50/1M range. DeepSeek R1 is a 671B reasoning-specialized model trained with reinforcement learning to produce chain-of-thought derivations. Its cheapest listed rate is $0.40/1M input and $2.00/1M output, with additional thinking-token costs on complex tasks.

At the cheapest listed providers the pricing gap is about 1.7x on input and 4.4x on output, or roughly 3x on a mixed workload, which means Qwen 3 72B Instruct wins by default on any workload where both models produce acceptable accuracy. That covers a meaningful share of production use cases: RAG pipelines, classification, summarization, and tool-augmented agentic tasks where a 72B model is sufficient.

[DeepSeek R1](/models/deepseek--deepseek-r1) pulls ahead on tasks that explicitly require extended reasoning: AIME-level math, multi-step code debugging where you need the model to articulate why an approach fails, or scientific Q&A where intermediate derivation steps affect downstream tool calls. On MATH-500 and similar benchmarks, R1's reported scores are typically well above those of general-purpose 72B-class models.

Qwen 3 72B Instruct earns its place for multilingual enterprise workloads: Chinese-English bilingual processing, mixed-language document pipelines, or any application that benefits from Alibaba's multilingual pretraining corpus. It's also a credible choice for high-throughput coding assistance at lower cost than R1.

Pick DeepSeek R1 if the task requires verifiable multi-step reasoning and cost is secondary. Pick Qwen 3 72B Instruct for multilingual workloads, cost-sensitive inference, or general-purpose tasks where 72B capability is sufficient.

Sample workload

5M in + 2M out / month — cheapest provider each

DeepSeek R1

$6.00/mo

Qwen 3 72B Instruct

$2.05/mo
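The two monthly figures above follow directly from the cheapest per-1M prices in the comparison table. A minimal sketch of that arithmetic, assuming flat per-token billing with no caching discounts or minimum fees (the `PRICES` dictionary and `monthly_cost` helper are illustrative names, not part of any provider API):

```python
# Cheapest listed per-1M-token prices (USD), copied from the comparison table.
PRICES = {
    "DeepSeek R1": {"in": 0.40, "out": 2.00},
    "Qwen 3 72B Instruct": {"in": 0.23, "out": 0.45},
}


def monthly_cost(model: str, tokens_in: float, tokens_out: float) -> float:
    """Return the monthly cost in USD for raw input/output token counts.

    Assumption: every token is billed at the flat listed rate.
    """
    price = PRICES[model]
    return (tokens_in * price["in"] + tokens_out * price["out"]) / 1_000_000


if __name__ == "__main__":
    # Sample workload: 5M input tokens and 2M output tokens per month.
    for name in PRICES:
        print(f"{name}: ${monthly_cost(name, 5_000_000, 2_000_000):.2f}/mo")
```

For DeepSeek R1, any thinking tokens the model emits would need to be included in `tokens_out` if the provider bills them as output. That is an assumption to confirm against the provider's pricing page.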


What changes at scale

$/mo estimate

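At the listed prices, output tokens account for most of the bill once output volume exceeds about 20% of input volume for DeepSeek R1 ($2.00 vs $0.40 per 1M) and about 51% for Qwen 3 72B Instruct ($0.45 vs $0.23 per 1M). Below those ratios input dominates, and a provider with cheaper input pricing can win regardless of its headline output price.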

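| Workload (per month) | DeepSeek R1 | Qwen 3 72B Instruct |
| --- | --- | --- |
| 1M in · 250K out | $0.90 | $0.34 |
| 5M in · 2M out | $6.00 | $2.05 |
| 20M in · 10M out | $28.00 | $9.10 |
| 100M in · 60M out | $160.00 | $50.00 |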
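To see which side of the bill dominates for a given token mix, compute the output share of total cost and the output-to-input ratio at which the two sides break even. The sketch below assumes the cheapest listed prices from the comparison table and flat per-token billing; `output_share` and `breakeven_ratio` are hypothetical helper names.

```python
# Cheapest listed per-1M-token prices (USD): (input, output).
PRICES = {
    "DeepSeek R1": (0.40, 2.00),
    "Qwen 3 72B Instruct": (0.23, 0.45),
}


def output_share(price_in, price_out, tokens_in, tokens_out):
    """Fraction of the total bill that comes from output tokens."""
    cost_in = tokens_in * price_in
    cost_out = tokens_out * price_out
    return cost_out / (cost_in + cost_out)


def breakeven_ratio(price_in, price_out):
    """Output-to-input token ratio at which input and output cost the same."""
    return price_in / price_out


if __name__ == "__main__":
    for name, (p_in, p_out) in PRICES.items():
        share = output_share(p_in, p_out, 1_000_000, 250_000)
        print(
            f"{name}: break-even out/in = {breakeven_ratio(p_in, p_out):.2f}, "
            f"output share at 1M in / 250K out = {share:.0%}"
        )
```

On the smallest workload in the table, output makes up a little over half of the DeepSeek R1 bill but only about a third of the Qwen 3 72B Instruct bill, so the same token mix can be output-dominated on one model and input-dominated on the other.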
