
Model crosswalk

A three-way, side-by-side comparison on price, capability, and workload.

DeepSeek R1 vs DeepSeek V3.2 vs Llama 3.1 405B Instruct
A: DeepSeek R1

671B params · 131K context · mit

Cheapest provider: deepinfra
$/1M input: $400000.00
$/1M output: $2000000.00
B: DeepSeek V3.2

671B params · 131K context · deepseek

Cheapest provider: together-ai
$/1M input: $270000.00
$/1M output: $1100000.00
C: Llama 3.1 405B Instruct

405B params · 131K context · llama-3

Cheapest provider: deepinfra
$/1M input: $2700000.00
$/1M output: $8000000.00
Specs and cheapest providers
| Spec | DeepSeek R1 | DeepSeek V3.2 | Llama 3.1 405B Instruct |
|---|---|---|---|
| Parameters | 671B | 671B | 405B |
| Context window | 131K tokens | 131K tokens | 131K tokens |
| License | mit | deepseek | llama-3 |
| Released | 2025-01-20 | 2025-05-07 | 2024-07-23 |
| Cheapest provider | deepinfra | together-ai | deepinfra |
| Input / 1M tokens | $400000.00 | $270000.00 🏆 | $2700000.00 |
| Output / 1M tokens | $2000000.00 | $1100000.00 🏆 | $8000000.00 |
Benchmark comparison

No benchmark data available yet.

Editor's take
Two DeepSeek models and Meta's flagship dense model, each targeting a different point on the capability-cost frontier.

DeepSeek R1 is a reasoning-specialized model trained with reinforcement learning to generate explicit chain-of-thought traces before producing final answers. On GPQA Diamond and competition math it outperforms much larger dense models. The chain of thought adds output tokens, which increases both latency and per-query cost, so the premium is justified only for tasks where reasoning-trace quality matters: formal proofs, multi-hop scientific QA, or workflows that require an auditable reasoning path. The context window is 131K tokens. R1 is listed here under the MIT license; confirm the terms your provider applies before deployment.

DeepSeek V3.2 is the May 2025 successor to V3, a mixture-of-experts model with roughly 37B active parameters per forward pass and a ~30% inference-cost reduction over V3. On code, math, and general reasoning benchmarks it delivers performance well above what its inference cost implies, with a 131K context window and broad provider availability. Where R1 optimizes for explicit reasoning depth, V3.2 optimizes for cost-efficient general capability across a broad range of tasks. It is listed under DeepSeek's own license, so its commercial terms need verification before deployment.

Llama 3.1 405B Instruct, at 405B dense parameters, offers the broadest knowledge coverage of the three: MMLU scores near the top of open-weights models at its July 2024 release, strong general instruction following, a 131K context window, and the Llama 3.1 Community License for commercial use. Its per-token cost is the highest in this group because all 405B parameters are active on every token and serving requires multiple GPUs.

Pick DeepSeek R1 when chain-of-thought reasoning quality on GPQA-class tasks is the deciding criterion. Pick DeepSeek V3.2 for strong general performance at the best cost-efficiency ratio. Pick Llama 3.1 405B when licensing flexibility to self-host and broad knowledge coverage are the priority.
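The point about reasoning tokens raising per-query cost can be made concrete with a small calculation. The sketch below is illustrative only: the prices are copied as listed in the table above, while the workload (2,000 prompt tokens, 500 answer tokens) and the 3,000 extra reasoning tokens for R1 are hypothetical numbers chosen for the example. It also assumes reasoning tokens are billed at the output rate. Costs are reported relative to the cheapest model, so the result does not depend on the absolute price scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def query_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, assuming linear per-token billing."""
    return (input_tokens * p.input_per_million
            + output_tokens * p.output_per_million) / 1_000_000


# Prices copied as listed in the comparison table (cheapest provider per model).
PRICES = {
    "DeepSeek R1": Pricing(400000.00, 2000000.00),
    "DeepSeek V3.2": Pricing(270000.00, 1100000.00),
    "Llama 3.1 405B Instruct": Pricing(2700000.00, 8000000.00),
}

# Hypothetical workload, not taken from the page.
PROMPT_TOKENS = 2_000
ANSWER_TOKENS = 500
# Assumption: R1 emits this many extra chain-of-thought tokens, billed as output.
R1_REASONING_TOKENS = 3_000


def relative_costs() -> dict[str, float]:
    """Per-query cost of each model as a multiple of the cheapest one."""
    costs = {}
    for name, pricing in PRICES.items():
        extra = R1_REASONING_TOKENS if name == "DeepSeek R1" else 0
        costs[name] = query_cost(pricing, PROMPT_TOKENS, ANSWER_TOKENS + extra)
    cheapest = min(costs.values())
    return {name: cost / cheapest for name, cost in costs.items()}


if __name__ == "__main__":
    for name, ratio in relative_costs().items():
        print(f"{name}: {ratio:.2f}x the cheapest")
```

Under these assumed token counts, V3.2 is the baseline and R1's reasoning overhead puts its per-query cost much closer to Llama 3.1 405B than the per-token prices alone suggest. Changing `R1_REASONING_TOKENS` shows how sensitive that gap is to trace length.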
Frequently asked questions
How does DeepSeek R1 compare to DeepSeek V3.2 and Llama 3.1 405B Instruct on price?
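Among the cheapest listed providers, DeepSeek V3.2 (together-ai) has the lowest input and output prices per 1M tokens, DeepSeek R1 (deepinfra) is next, and Llama 3.1 405B Instruct (deepinfra) is the most expensive. See the table above for the exact figures.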
Which model is best for coding: DeepSeek R1, DeepSeek V3.2, or Llama 3.1 405B Instruct?
No benchmark data is available on this page yet, so HumanEval and other code benchmarks cannot be compared here. For production code tasks, also consider context window size and provider latency.
What is the context window for DeepSeek R1, DeepSeek V3.2, and Llama 3.1 405B Instruct?
All three models are listed with a 131K-token context window; see the Context window row of the specs table above.