
Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

DeepSeek R1 vs DeepSeek V3.2

DeepSeek R1

671B params · 131K context · mit

Cheapest provider: deepinfra
$/1M input: $0.40
$/1M output: $2.00

DeepSeek V3.2

671B params · 131K context · deepseek

Cheapest provider: together-ai
$/1M input: $0.27
$/1M output: $1.10
Specs and cheapest providers

| Spec | DeepSeek R1 | DeepSeek V3.2 |
|---|---|---|
| Parameters | 671B | 671B |
| Context window | 131K tokens | 131K tokens |
| License | mit | deepseek |
| Released | 2025-01-20 | 2025-05-07 |
| Cheapest provider | deepinfra | together-ai |
| Input / 1M tokens | $0.40 | $0.27 🏆 |
| Output / 1M tokens | $2.00 | $1.10 🏆 |


Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

Using each model's cheapest provider:
DeepSeek R1: $6.00 /mo
DeepSeek V3.2: $3.55 /mo
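
The monthly figures follow from a linear formula: token volume times the per-1M rate, summed over input and output. Here is a minimal Python sketch, assuming the cheapest-provider rates are $0.40 / $2.00 (R1) and $0.27 / $1.10 (V3.2) in US dollars per 1M tokens. The `PRICES` table and `monthly_cost` helper are illustrative names, not part of any provider API.

```python
# Assumed rates in US dollars per 1M tokens (cheapest providers from this page).
PRICES = {
    "DeepSeek R1": {"input": 0.40, "output": 2.00},
    "DeepSeek V3.2": {"input": 0.27, "output": 1.10},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the monthly bill in dollars for the given token volumes."""
    rate = PRICES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000  # rates are quoted per 1M tokens


# Sample workload: 5M input tokens and 2M output tokens per month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 5_000_000, 2_000_000):.2f} /mo")
```

Swapping in another provider's rates only requires changing the `PRICES` entries; the formula itself does not account for caching discounts or reasoning-token overhead.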

What changes at scale

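Both models price output at roughly four to five times input (5x for R1, about 4x for V3.2), so output tokens make up most of the bill once output volume exceeds about a fifth to a quarter of input volume. Only on more input-heavy workloads does input dominate, and there cheaper-input providers win regardless of headline price.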

| Monthly volume | DeepSeek R1 | DeepSeek V3.2 |
|---|---|---|
| 1M in · 250K out | $0.90 | $0.545 |
| 5M in · 2M out | $6.00 | $3.55 |
| 20M in · 10M out | $28.00 | $16.40 |
| 100M in · 60M out | $160.00 | $93.00 |
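
To see which side of the bill dominates at each volume, one option is to compute the share of cost that comes from output tokens, along with the break-even output-to-input token ratio. The sketch below assumes the same per-1M rates as the cheapest providers listed on this page ($0.40 / $2.00 for R1, $0.27 / $1.10 for V3.2). The function names are illustrative.

```python
def output_share(input_tokens, output_tokens, input_price, output_price):
    """Fraction of the total bill that comes from output tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return output_cost / (input_cost + output_cost)


def break_even_ratio(input_price, output_price):
    """Output tokens per input token at which both sides cost the same."""
    return input_price / output_price


# Assumed (input, output) rates in dollars per 1M tokens.
RATES = {"DeepSeek R1": (0.40, 2.00), "DeepSeek V3.2": (0.27, 1.10)}
WORKLOADS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]

for name, (p_in, p_out) in RATES.items():
    print(f"{name}: break-even out/in = {break_even_ratio(p_in, p_out):.3f}")
    for t_in, t_out in WORKLOADS:
        share = output_share(t_in, t_out, p_in, p_out)
        print(f"  {t_in:>11,} in / {t_out:>10,} out: {share:.0%} of cost is output")
```

Because the share depends only on the price ratio and the token ratio, it is unaffected by the absolute scale of the rates. Under these assumed rates the break-even sits at 0.2 output tokens per input token for R1 and about 0.245 for V3.2.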

Capability vs price

[Scatter plot: benchmark score × $/1M output tokens]
Editor's take

DeepSeek V3.2 is an incremental update to the V3 MoE architecture, tightening instruction fidelity and improving coding benchmarks relative to V3 without changing the fundamental design: it is still a 671B MoE with ~37B active parameters. [DeepSeek R1](/models/deepseek--deepseek-r1) remains a distinct product: a reinforcement-learning reasoning model that generates extended thinking traces before committing to an answer.

The cost gap is real and persistent. V3.2 continues the V3 pricing trend of $0.14–$0.30/1M input tokens on competitive providers. R1 runs $0.50–$1.50/1M or higher depending on thinking token allocation, because the extended CoT chains push per-request GPU cost up materially. For latency-sensitive applications, R1's generation time is also notably longer per turn.

[DeepSeek V3.2](/models/deepseek--deepseek-v3.2) earns its place in production RAG pipelines, long-document summarization, and high-throughput agentic scaffolds where the model needs to reliably follow instructions, call tools, and handle 64K+ context at reasonable cost. The V3.2 improvements are most visible in code generation and multi-turn instruction fidelity.

DeepSeek R1 remains the model to benchmark against for hard quantitative reasoning: AIME-class math, complex algorithmic proofs, or audit workflows where the chain-of-thought trace is itself a deliverable. On tasks where you're checking the reasoning path, not just the answer, R1's transparency is a feature.

Pick DeepSeek R1 if your task demands verifiable, step-by-step derivation and you can absorb the cost and latency premium. Pick DeepSeek V3.2 if you need improved instruction following over V3 at roughly the same price point.