DeepSeek R1 vs DeepSeek V3 vs DeepSeek V3.2 (2026): 3-way comparison

Model crosswalk

Side-by-side comparison of price, capability and workload.

DeepSeek R1

671B params · 131K context · MIT

Cheapest provider: deepinfra

$/1M input: $400000.00

$/1M output: $2000000.00

DeepSeek V3

671B params · 131K context · deepseek

Cheapest provider: n/a

$/1M input: n/a

$/1M output: n/a

DeepSeek V3.2

671B params · 131K context · deepseek

Cheapest provider: n/a

$/1M input: n/a

$/1M output: n/a

Specs and cheapest providers

Spec	DeepSeek R1	DeepSeek V3	DeepSeek V3.2
Parameters	671B	671B	671B
Context window	131K tokens	131K tokens	131K tokens
License	MIT	deepseek	deepseek
Released	2025-01-20	2024-12-26	2025-05-07
Cheapest provider	deepinfra	n/a	n/a
Input price (USD / 1M tokens)	$400000.00	n/a	n/a
Output price (USD / 1M tokens)	$2000000.00	n/a	n/a

Benchmark comparison

No benchmark data available yet.

Editor's take

All three models come from DeepSeek and share the same 671-billion-parameter MoE architecture, which activates roughly 37B parameters per token. What separates them is training objective and release timeline rather than raw scale.

DeepSeek V3, released December 2024, was the baseline: a strong general-purpose model that drew attention for matching frontier proprietary models on code and math benchmarks at a fraction of the inference cost. It remains hosted on DeepInfra, Fireworks, and OpenRouter, but it is now the legacy variant within this family.

DeepSeek R1, released January 2025, takes a different approach. Rather than optimizing for throughput, R1 adds explicit chain-of-thought reasoning traces trained via reinforcement learning, which meaningfully improves performance on AIME, MATH, and multi-step logic tasks. The tradeoff is token count: R1 emits substantially more tokens per answer, which drives up latency and cost per query. Its MIT license removes commercial friction.

DeepSeek V3.2, released May 2025, is the cost-efficiency successor to V3. It dropped inference pricing roughly 30 percent relative to V3 while maintaining comparable general-capability benchmarks. For teams that do not need chain-of-thought reasoning traces, V3.2 is simply the better V3: there is no architectural reason to stay on the earlier release.

Pick R1 if your workload rewards explicit multi-step reasoning and you can absorb the higher per-query token cost. Pick V3.2 for general chat, code generation, and instruction-following at the lowest cost within this family. V3 is worth running only if you already have it pinned and need reproducibility against a specific checkpoint.
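To make the token-count tradeoff concrete, here is a minimal sketch of how per-1M-token prices translate into a per-query cost. The `Pricing` class, the `cost_per_query` function, and every price and token count below are hypothetical placeholders chosen for illustration; they are not quotes from any provider or measurements of any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-1M-token prices in USD (placeholder values, not real quotes)."""
    input_per_1m: float
    output_per_1m: float


def cost_per_query(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request given its token counts."""
    return (input_tokens * pricing.input_per_1m
            + output_tokens * pricing.output_per_1m) / 1_000_000


# Hypothetical prices and token counts, used only to show the arithmetic.
example = Pricing(input_per_1m=1.00, output_per_1m=4.00)

# Same prompt, but the second response carries a long reasoning trace.
short_answer = cost_per_query(example, input_tokens=1_000, output_tokens=500)
long_reasoning = cost_per_query(example, input_tokens=1_000, output_tokens=4_000)

print(f"short answer:   ${short_answer:.4f}")
print(f"long reasoning: ${long_reasoning:.4f}")
```

Under these assumed numbers the longer response costs several times more even though the per-token price is identical, which is why output length matters as much as the listed rate when comparing a reasoning model with a non-reasoning one.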

Compare two at a time

DeepSeek R1 vs DeepSeek V3
DeepSeek R1 vs DeepSeek V3.2
DeepSeek V3 vs DeepSeek V3.2

Frequently asked questions

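How does DeepSeek R1 compare to DeepSeek V3 and DeepSeek V3.2 on price?
The Specs and cheapest providers table lists input and output prices per 1M tokens for each model's cheapest available provider. At present only DeepSeek R1 has a listed provider (deepinfra); no provider pricing is listed for DeepSeek V3 or DeepSeek V3.2.

Which model is best for coding: DeepSeek R1, DeepSeek V3, or DeepSeek V3.2?
No benchmark data such as HumanEval is available in the comparison yet. The Editor's take recommends V3.2 for code generation at the lowest cost within this family, and R1 where explicit multi-step reasoning pays off. For production code tasks, also consider context window size and provider latency.

What is the context window for DeepSeek R1, DeepSeek V3, and DeepSeek V3.2?
All three models are listed with a 131K-token context window in the Specs and cheapest providers table.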

Full model details

All providers for DeepSeek R1
All providers for DeepSeek V3
All providers for DeepSeek V3.2