Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

DeepSeek R1 vs DeepSeek V3
A: DeepSeek R1

671B params · 131K context · mit

Cheapest provider: deepinfra
$/1M input: $400000.00
$/1M output: $2000000.00

B: DeepSeek V3

671B params · 131K context · deepseek

Cheapest provider: deepinfra
$/1M input: $200000.00
$/1M output: $850000.00

Specs and cheapest providers

| Spec | DeepSeek R1 | DeepSeek V3 |
|---|---|---|
| Parameters | 671B | 671B |
| Context window | 131K tokens | 131K tokens |
| License | mit | deepseek |
| Released | 2025-01-20 | 2024-12-26 |
| Cheapest provider | deepinfra | deepinfra |
| Input / 1M tokens | $400000.00 | $200000.00 🏆 |
| Output / 1M tokens | $2000000.00 | $850000.00 🏆 |

Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

Using each model's cheapest provider:

DeepSeek R1: $6000000.00 /mo
DeepSeek V3: $2700000.00 /mo
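
The monthly figures above follow from a linear formula: token volume in millions times the per-1M rate, summed over input and output. The sketch below reproduces them, assuming the rates exactly as displayed on this page; the names `Rates` and `monthly_cost` are illustrative and not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-1M-token prices in dollars, copied as displayed on this page."""
    input_per_m: float
    output_per_m: float


# Assumption: rates for the cheapest listed provider (deepinfra), as shown above.
R1 = Rates(input_per_m=400000.00, output_per_m=2000000.00)
V3 = Rates(input_per_m=200000.00, output_per_m=850000.00)


def monthly_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Linear cost: (tokens / 1M) * rate for input, plus the same for output."""
    input_cost = (input_tokens / 1_000_000) * rates.input_per_m
    output_cost = (output_tokens / 1_000_000) * rates.output_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    # Sample workload from this page: 5M input + 2M output tokens per month.
    for name, rates in (("DeepSeek R1", R1), ("DeepSeek V3", V3)):
        print(f"{name}: ${monthly_cost(rates, 5_000_000, 2_000_000):.2f} /mo")
```

Swapping in another provider's rates only requires a different `Rates` instance; the formula itself does not change.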

What changes at scale

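Output tokens are priced well above input tokens for both models (5x for DeepSeek R1 and 4.25x for DeepSeek V3 at the rates above), so output cost overtakes input cost once output volume exceeds roughly one fifth (R1) or one quarter (V3) of input volume. Only workloads more input-heavy than that are dominated by the input rate, and for those, cheaper-input providers win regardless of headline price.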

| Monthly volume | DeepSeek R1 | DeepSeek V3 |
|---|---|---|
| 1M in · 250K out | $900000.00 | $412500.00 |
| 5M in · 2M out | $6000000.00 | $2700000.00 |
| 20M in · 10M out | $28000000.00 | $12500000.00 |
| 100M in · 60M out | $160000000.00 | $71000000.00 |
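
To see which side of the bill dominates for a given token mix, one option is to compute the output share of total cost and the output-to-input token ratio at which the two costs are equal. This is a minimal sketch using the per-1M rates displayed on this page; the function names are hypothetical helpers, not an existing library.

```python
def break_even_ratio(input_per_m: float, output_per_m: float) -> float:
    """Output/input token ratio at which output cost equals input cost.

    Setting out_tokens * output_per_m == in_tokens * input_per_m gives
    out_tokens / in_tokens == input_per_m / output_per_m.
    """
    return input_per_m / output_per_m


def output_cost_share(
    input_per_m: float,
    output_per_m: float,
    input_tokens: int,
    output_tokens: int,
) -> float:
    """Fraction of the total bill attributable to output tokens."""
    input_cost = (input_tokens / 1_000_000) * input_per_m
    output_cost = (output_tokens / 1_000_000) * output_per_m
    return output_cost / (input_cost + output_cost)


if __name__ == "__main__":
    # Assumption: rates as displayed on this page (cheapest provider per model).
    rates = {
        "DeepSeek R1": (400000.00, 2000000.00),
        "DeepSeek V3": (200000.00, 850000.00),
    }
    mixes = [
        (1_000_000, 250_000),
        (5_000_000, 2_000_000),
        (20_000_000, 10_000_000),
        (100_000_000, 60_000_000),
    ]
    for name, (in_rate, out_rate) in rates.items():
        print(f"{name}: break-even out/in ratio = {break_even_ratio(in_rate, out_rate):.3f}")
        for in_tok, out_tok in mixes:
            share = output_cost_share(in_rate, out_rate, in_tok, out_tok)
            print(f"  {in_tok:>11,} in / {out_tok:>10,} out: output share {share:.1%}")
```

A share above 50% means the output rate is the number to shop on; below 50%, the input rate matters more.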

Capability vs price

[Scatter plot: benchmark score vs. $/1M output tokens]
Editor's take
Same lab, fundamentally different design objectives. DeepSeek R1 is a reasoning model trained with reinforcement learning: it generates extended chain-of-thought traces and excels on tasks where explicit step-by-step derivation beats retrieval. [DeepSeek V3](/models/deepseek--deepseek-v3) is a 671B-parameter Mixture-of-Experts model optimized for throughput, general capability, and instruction following at scale, with ~37B parameters active per token.

The cost picture separates them clearly. DeepSeek V3 on commodity providers typically runs $0.14–$0.28/1M input tokens. DeepSeek R1 carries a premium (expect $0.50–$1.50/1M, depending on provider) because its extended thinking traces consume meaningfully more GPU time per request. If you're running tens of millions of tokens per day, that delta compounds fast.

[DeepSeek R1](/models/deepseek--deepseek-r1) is the right call for tasks where accuracy on hard reasoning problems justifies the cost: competitive math, multi-step code proofs, formal verification assistance, or research workflows where you're willing to pay for a model that shows its work and catches its own errors. Its reported AIME 2024 and MATH-500 scores are significantly above V3's.

DeepSeek V3 wins for high-volume general workloads: long-form drafting, RAG pipelines over large document sets, classification at scale, or agentic loops where most steps need only reliable instruction execution at low per-token cost, not deep reasoning.

Pick DeepSeek R1 if your task requires explicit, step-by-step reasoning and budget allows. Pick DeepSeek V3 if throughput and cost efficiency matter more than reasoning depth.