Model crosswalk
A three-way, side-by-side comparison on price, capability, and workload.
Codestral 22b vs Qwen 2.5 Coder 32b Instruct vs Starcoder2 15b Instruct
A. Codestral 22b
Cheapest provider: —
$/1M input: —
$/1M output: —
B. Qwen 2.5 Coder 32b Instruct
Cheapest provider: —
$/1M input: —
$/1M output: —
C. Starcoder2 15b Instruct
Cheapest provider: —
$/1M input: —
$/1M output: —
Specs and cheapest providers
| Spec | Codestral 22b | Qwen 2.5 Coder 32b Instruct | Starcoder2 15b Instruct |
|---|---|---|---|
| Parameters | — | — | — |
| Context window | — | — | — |
| License | — | — | — |
| Released | — | — | — |
| Cheapest provider | | | |
| Provider | — | — | — |
| Input / 1M tokens | — | — | — |
| Output / 1M tokens | — | — | — |
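
Once the price rows above are populated, a per-workload cost follows from simple arithmetic on the input and output rates. The sketch below is a minimal illustration, not part of this page's tooling: the `Pricing` class, the `workload_cost` function, and the example prices are all hypothetical placeholders, since no real prices are listed here yet.

```python
from dataclasses import dataclass

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices in the table are quoted per 1M tokens


@dataclass(frozen=True)
class Pricing:
    """Per-1M-token prices in USD (hypothetical container, not a provider API)."""
    input_per_million: float
    output_per_million: float


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of sending input_tokens and receiving output_tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / TOKENS_PER_PRICE_UNIT


# Placeholder prices for illustration only; substitute real provider quotes.
example = Pricing(input_per_million=0.20, output_per_million=0.60)
monthly = workload_cost(example, input_tokens=50_000_000, output_tokens=10_000_000)
print(f"Estimated monthly cost: ${monthly:.2f}")
```

Because output tokens are often priced higher than input tokens, running the same calculation with your own input-to-output ratio tends to be more informative than comparing either rate alone.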
Benchmark comparison
No benchmark data available yet.
Editor's take
Three coding specialists occupy distinct positions in the under-50B segment, and the differences go beyond raw parameter counts.
Codestral 22B, released by Mistral AI in May 2024, covers 80-plus programming languages across a 32K context window. Its HumanEval performance is competitive with Qwen 2.5 Coder 7B and DeepSeek Coder V2 Lite. The critical catch is licensing: Codestral ships under the Mistral AI Non-Production License (MNPL), which blocks commercial production deployment without a separate agreement from Mistral. Teams evaluating it as an API backend need to settle that licensing question first.
Qwen 2.5 Coder 32B Instruct, released November 2024 by Alibaba, supports 92 programming languages and brings a 131K context window, roughly four times Codestral's 32K, which matters for multi-file refactoring tasks. On LiveCodeBench and MultiPL-E it sits alongside DeepSeek Coder V2 as a credible production alternative, at 32B parameters rather than a larger model tier. It is released under Apache 2.0, which permits commercial deployment, and hosting is available from DeepInfra and other inference providers at competitive per-token rates.
StarCoder2 15B Instruct, from the BigCode collaboration (Hugging Face and ServiceNow), released April 2024, runs on a 16K context and is not the benchmark leader among these three. Its differentiated value is training-data provenance: the underlying training data comes from The Stack v2, a documented dataset built around permissively licensed source code. For regulated industries with strict IP policies around training data, that traceability can be worth more than a benchmark lead. It ships under BigCode OpenRAIL-M, which allows commercial use subject to a short list of use restrictions.
Pick Codestral 22B for non-commercial research and local IDE experiments where the license fits. Pick Qwen 2.5 Coder 32B for production API-scale code generation or CI-integrated completion pipelines. Pick StarCoder2 15B when your legal team requires verifiable training-data provenance over peak benchmark scores.
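The recommendations above can be read as a small decision rule. The sketch below is one possible encoding, assuming that a provenance requirement takes priority over the commercial-use question (the text does not rank them). The `Requirements` class and both functions are hypothetical names, and the context sizes are the rounded figures quoted in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    commercial_use: bool         # will the model serve production or paid traffic?
    needs_data_provenance: bool  # does legal require traceable training data?


# Approximate context windows in tokens, as stated in the text above.
CONTEXT_TOKENS = {
    "Codestral 22B": 32_000,
    "Qwen 2.5 Coder 32B Instruct": 131_000,
    "StarCoder2 15B Instruct": 16_000,
}


def recommend(req: Requirements) -> str:
    """Map requirements to a model, following the editor's picks.

    Assumption: provenance outranks commercial use when both apply.
    """
    if req.needs_data_provenance:
        return "StarCoder2 15B Instruct"
    if req.commercial_use:
        return "Qwen 2.5 Coder 32B Instruct"
    return "Codestral 22B"


def fits_context(model: str, prompt_tokens: int) -> bool:
    """Check whether a prompt of the given size fits the model's context window."""
    return prompt_tokens <= CONTEXT_TOKENS[model]


choice = recommend(Requirements(commercial_use=True, needs_data_provenance=False))
print(choice, fits_context(choice, prompt_tokens=60_000))
```

Checking the prompt size separately keeps the licensing decision and the context-window constraint independent, so a team can see which of the two is actually ruling a model out.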
Frequently asked questions
- How does Codestral 22b compare to Qwen 2.5 Coder 32b Instruct and Starcoder2 15b Instruct on price?
- Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
- Which model is best for coding: Codestral 22b, Qwen 2.5 Coder 32b Instruct, or Starcoder2 15b Instruct?
- Benchmark scores such as HumanEval appear in the Benchmark comparison section when data is available. For production code tasks, also consider context window size and provider latency.
- What is the context window for Codestral 22b, Qwen 2.5 Coder 32b Instruct, and Starcoder2 15b Instruct?
- Context window sizes are listed in the Context window row of the specs table above.