
Model crosswalk

Side-by-side on price, capability and workload — three-way comparison.

Command R Plus vs Llama 3.1 405b Instruct vs Qwen 3 72b Instruct
Command R Plus

Cheapest provider
$/1M input
$/1M output
Llama 3.1 405b Instruct

Cheapest provider
$/1M input
$/1M output
Qwen 3 72b Instruct

Cheapest provider
$/1M input
$/1M output
Specs and cheapest providers
Spec | Command R Plus | Llama 3.1 405b Instruct | Qwen 3 72b Instruct
Parameters
Context window
License
Released
Cheapest provider
Provider
Input / 1M tokens
Output / 1M tokens
Benchmark comparison

No benchmark data available yet.

Editor's take
An enterprise RAG specialist, a frontier-scale dense model, and a balanced multilingual 72B: each makes sense for a different primary workload.

Cohere's Command R+ is a 104B-parameter model released in April 2024, built explicitly for retrieval-augmented generation and multi-step tool use. It scores well on grounding and function-calling benchmarks, which reflects deliberate architectural and fine-tuning investment in those capabilities rather than scale alone. The 131K context window handles long retrieval contexts. The significant constraint is licensing: Command R+ uses Cohere's CC-BY-NC license, which prohibits commercial deployment through third-party inference hosts, so production use is channeled through Cohere's own API. Teams evaluating Command R+ should weigh that vendor dependency against the RAG-quality argument.

Llama 3.1 405B Instruct is the largest dense open-weights model in this group, with 405B parameters, a 131K context window, and Llama 3.1 Community License terms that allow commercial use with attribution. Its strengths are MMLU scores near the top of the open-weights tier at launch, broad provider support, and the option to self-host on sufficient hardware. Its inference cost is significantly higher than either peer's, which is justified only when the task genuinely requires frontier-level capability.

Qwen 3 72B Instruct, from April 2025, covers MMLU, code, and multilingual benchmarks comprehensively at a fraction of 405B pricing. For most RAG pipelines that need broad knowledge retrieval rather than Cohere's specialized grounding stack, Qwen 3 72B is the more practical default.

Pick Command R+ when you need Cohere's grounding stack and are willing to route through their API. Pick Llama 3.1 405B when frontier capability and the licensing flexibility to self-host are the priority. Pick Qwen 3 72B for general-purpose, high-performance inference at 72B-class pricing.
Frequently asked questions
How does Command R Plus compare to Llama 3.1 405b Instruct and Qwen 3 72b Instruct on price?
Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
Which model is best for coding: Command R Plus, Llama 3.1 405b Instruct, or Qwen 3 72b Instruct?
HumanEval and other code benchmarks appear in the Benchmark comparison section once data is available; none is listed yet. For production code tasks, also consider context window size and provider latency.
What is the context window for Command R Plus, Llama 3.1 405b Instruct, and Qwen 3 72b Instruct?
Context window sizes are listed in the Context window row of the Specs table above.
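
For the price question above, per-1M-token rates only become comparable once they are applied to a concrete workload, because input and output tokens are billed separately. The sketch below shows one way to do that arithmetic. The `Price` class, the function names, the `model-a`/`model-b`/`model-c` labels, and every dollar figure are hypothetical placeholders, not quotes for the models on this page; substitute the values from the Cheapest provider table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Provider rates in USD per 1M tokens (placeholder structure)."""
    input_per_1m: float
    output_per_1m: float


def workload_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload billed at per-1M-token rates."""
    total = input_tokens * price.input_per_1m + output_tokens * price.output_per_1m
    return total / 1_000_000


def cheapest_model(prices: dict[str, Price], input_tokens: int, output_tokens: int) -> str:
    """Return the name of the model with the lowest cost for this workload."""
    return min(
        prices,
        key=lambda name: workload_cost(prices[name], input_tokens, output_tokens),
    )


# Placeholder rates for illustration only; these are NOT real provider prices.
PLACEHOLDER_PRICES = {
    "model-a": Price(input_per_1m=3.00, output_per_1m=15.00),
    "model-b": Price(input_per_1m=2.00, output_per_1m=6.00),
    "model-c": Price(input_per_1m=0.50, output_per_1m=1.50),
}

if __name__ == "__main__":
    # Example monthly workload: 2M input tokens, 0.5M output tokens.
    monthly_input, monthly_output = 2_000_000, 500_000
    for name, price in PLACEHOLDER_PRICES.items():
        cost = workload_cost(price, monthly_input, monthly_output)
        print(f"{name}: ${cost:.2f}")
    print("cheapest:", cheapest_model(PLACEHOLDER_PRICES, monthly_input, monthly_output))
```

RAG workloads tend to be input-heavy (long retrieved context, short answers), so the input rate usually dominates the total. Run the comparison with your own input-to-output ratio rather than judging on either rate alone.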