Model crosswalk

A three-way, side-by-side comparison on price, capability, and workload.

DeepSeek R1 vs DeepSeek V3 vs DeepSeek V3.2

DeepSeek R1

671B params · 131K context · mit

Cheapest provider: deepinfra
$/1M input: $400000.00
$/1M output: $2000000.00

DeepSeek V3

671B params · 131K context · deepseek

Cheapest provider
$/1M input
$/1M output

DeepSeek V3.2

671B params · 131K context · deepseek

Cheapest provider
$/1M input
$/1M output
Specs and cheapest providers

Spec                 DeepSeek R1    DeepSeek V3    DeepSeek V3.2
Parameters           671B           671B           671B
Context window       131K tokens    131K tokens    131K tokens
License              mit            deepseek       deepseek
Released             2025-01-20     2024-12-26     2025-05-07

Cheapest provider    DeepSeek R1    DeepSeek V3    DeepSeek V3.2
Provider             deepinfra      -              -
Input / 1M tokens    $400000.00     -              -
Output / 1M tokens   $2000000.00    -              -

Benchmark comparison

No benchmark data available yet.

Editor's take
All three models come from DeepSeek and share the same 671-billion-parameter mixture-of-experts (MoE) architecture, which activates roughly 37B parameters per token. What separates them is training objective and release timeline rather than raw scale.

DeepSeek V3, released in December 2024, was the baseline: a strong general-purpose model that drew attention for matching frontier proprietary models on code and math benchmarks at a fraction of the inference cost. It is still hosted on DeepInfra, Fireworks, and OpenRouter, but it is now the legacy variant within this family.

DeepSeek R1, released in January 2025, takes a different approach. Rather than optimizing for throughput, R1 produces explicit chain-of-thought reasoning traces, a behavior trained via reinforcement learning, which improves performance on AIME, MATH, and multi-step logic tasks. The tradeoff is token count: R1 emits substantially more tokens per answer, which drives up latency and cost per query. Its MIT license removes commercial friction.

DeepSeek V3.2, released in May 2025, is the cost-efficiency successor to V3. It cut inference pricing by roughly 30 percent relative to V3 while maintaining comparable general-capability benchmarks. For teams that do not need chain-of-thought reasoning traces, V3.2 is simply the better V3, and there is no architectural reason to stay on the earlier release.

Pick R1 if your workload rewards explicit multi-step reasoning and you can absorb the higher per-query token cost. Pick V3.2 for general chat, code generation, and instruction-following at the lowest cost within this family. V3 is worth running only if you already have it pinned and need reproducibility against a specific checkpoint.
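
To make the token-count tradeoff concrete, here is a minimal sketch of how a per-query cost could be estimated from per-1M-token prices. The `Pricing` class, the `query_cost` helper, and every price and token count below are hypothetical placeholders chosen for illustration; they are not the listed prices for any of these models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Provider prices in US dollars per 1M tokens (hypothetical values)."""
    input_per_million: float
    output_per_million: float


def query_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the given token counts."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


# Placeholder prices and token counts, used only to illustrate the arithmetic.
example_pricing = Pricing(input_per_million=1.0, output_per_million=4.0)

# Same prompt, but the second response carries a long reasoning trace.
short_answer = query_cost(example_pricing, input_tokens=1_000, output_tokens=500)
long_reasoning = query_cost(example_pricing, input_tokens=1_000, output_tokens=5_000)

print(f"short answer:   ${short_answer:.4f}")
print(f"long reasoning: ${long_reasoning:.4f}")
```

At identical per-token prices, the request with ten times the output tokens costs several times more, so it can help to plug in a provider's real prices and your own measured token counts before choosing between a reasoning model and a general-purpose one.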
Frequently asked questions
How does DeepSeek R1 compare to DeepSeek V3 and DeepSeek V3.2 on price?
Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
Which model is best for coding: DeepSeek R1, DeepSeek V3, or DeepSeek V3.2?
No benchmark data is available yet for these models, so HumanEval and other code benchmark scores are not shown. For production code tasks, also consider context window size and provider latency.
What is the context window for DeepSeek R1, DeepSeek V3, and DeepSeek V3.2?
All three models have a 131K-token context window, as listed in the Context window row of the specs table above.