Codestral 22B vs Qwen 2.5 Coder 7B Instruct vs StarCoder2 15B Instruct (2026): 3-way comparison

Model crosswalk

Side-by-side on price, capability and workload — three-way comparison.

Codestral 22B

22B params · 33K context · mistral-research

Cheapest provider: —

$/1M input: —

$/1M output: —

Qwen 2.5 Coder 7B Instruct

7B params · 131K context · qwen

Cheapest provider: —

$/1M input: —

$/1M output: —

StarCoder2 15B Instruct

15B params · 16K context · bigcode-openrail-m

Cheapest provider: —

$/1M input: —

$/1M output: —

Specs and cheapest providers

Spec	Codestral 22B	Qwen 2.5 Coder 7B Instruct	StarCoder2 15B Instruct
Parameters	22B	7B	15B
Context window	33K tokens	131K tokens🏆	16K tokens
License	mistral-research	qwen	bigcode-openrail-m
Released	2024-05-29	2024-11-12	2024-09-06
Cheapest provider
Provider	—	—	—
Input / 1M tokens	—	—	—
Output / 1M tokens	—	—	—
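
The per-1M-token rows translate into a per-request cost by simple proportion. The following is a minimal Python sketch of that arithmetic. The function name `estimate_cost` and the example prices are hypothetical placeholders, since no provider prices are listed for these models, and a missing price is represented as `None`.

```python
from typing import Optional


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: Optional[float],
    output_price_per_1m: Optional[float],
) -> Optional[float]:
    """Return the request cost in dollars, or None if either price is missing."""
    if input_price_per_1m is None or output_price_per_1m is None:
        return None  # mirrors the dash placeholder used in the table
    return (
        input_tokens / 1_000_000 * input_price_per_1m
        + output_tokens / 1_000_000 * output_price_per_1m
    )


# Hypothetical prices for illustration only, not quotes from any provider.
cost = estimate_cost(200_000, 50_000, 0.10, 0.20)
print(cost)  # roughly 0.03 dollars, up to floating-point rounding
```

Because input and output are priced separately, a workload dominated by long prompts and short completions is far more sensitive to the input price than to the output price.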

Benchmark comparison

No benchmark data available yet.

Editor's take

This is a direct comparison of three coding models at different parameter counts: two commercially usable options and one with a clear production blocker.

Codestral 22B is Mistral AI's first code-specialist model, released in May 2024. It covers 80-plus programming languages with a 32K context window (listed as 33K in the table above), and on HumanEval it competes with DeepSeek Coder V2 Lite and Qwen 2.5 Coder 7B. The 22B parameter count puts it above typical autocomplete models in quality, but the Mistral Research License prohibits commercial production deployment without a direct agreement with Mistral. This is a genuine constraint: teams often benchmark Codestral favorably and then hit this wall during legal review.

Qwen 2.5 Coder 7B Instruct, released in November 2024, has HumanEval scores competitive with DeepSeek Coder 6.7B and StarCoder2 7B, plus a 131K context window at 7 billion parameters, which is generous for its size tier. Hosted pricing typically runs below $0.20 per million tokens, and the Qwen license permits commercial deployment. For teams looking for an autocomplete backbone that can also take in meaningful multi-file context, this is where most evaluation processes land.

StarCoder2 15B Instruct sits between the other two in parameter count at 15B, with a 16K context window, and comes from the BigCode collaboration (Hugging Face and ServiceNow). It trails both Codestral and Qwen 2.5 Coder on raw benchmarks, so it does not win on performance. Its differentiated case is training-data auditability: The Stack v2 training corpus is restricted to permissively licensed code, and that chain of custody matters to teams in regulated sectors. BigCode OpenRAIL-M is commercially usable, subject to its use restrictions.

Pick Codestral 22B for internal research or non-commercial tooling. Pick Qwen 2.5 Coder 7B for commercial production APIs at competitive token economics. Pick StarCoder2 15B when IP audit requirements demand verifiable training-data provenance.
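
The selection logic in this section can be sketched as a small filter over the specs table. This is an illustrative Python sketch, not a definitive rule set: the `ModelSpec` class, the `shortlist` function, and the two boolean flags are hypothetical names, and the flags simplify the licensing and provenance remarks above into yes/no values (they are not legal advice).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelSpec:
    name: str
    params_b: int         # parameters in billions, from the specs table
    context_k: int        # context window in thousands of tokens, from the specs table
    license: str
    commercial_ok: bool   # assumption: simplified from the Editor's take
    auditable_data: bool  # assumption: provenance is only highlighted for StarCoder2


MODELS = [
    ModelSpec("Codestral 22B", 22, 33, "mistral-research", False, False),
    ModelSpec("Qwen 2.5 Coder 7B Instruct", 7, 131, "qwen", True, False),
    ModelSpec("StarCoder2 15B Instruct", 15, 16, "bigcode-openrail-m", True, True),
]


def shortlist(
    commercial: bool, min_context_k: int = 0, need_provenance: bool = False
) -> List[str]:
    """Return the names of models that satisfy all of the given constraints."""
    return [
        m.name
        for m in MODELS
        if (m.commercial_ok or not commercial)
        and m.context_k >= min_context_k
        and (m.auditable_data or not need_provenance)
    ]


print(shortlist(commercial=True, min_context_k=32))
print(shortlist(commercial=True, need_provenance=True))
print(shortlist(commercial=False, min_context_k=32))
```

Treating license and provenance as hard filters, and context window as a threshold, reflects how the recommendations are framed: benchmark quality only matters among the models that survive those constraints.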

Compare two at a time

Codestral 22B vs Qwen 2.5 Coder 7B Instruct

Codestral 22B vs StarCoder2 15B Instruct

Qwen 2.5 Coder 7B Instruct vs StarCoder2 15B Instruct

Frequently asked questions

How does Codestral 22B compare to Qwen 2.5 Coder 7B Instruct and StarCoder2 15B Instruct on price? Compare the input and output prices per 1M tokens in the specs table above. No provider prices are listed for these three models yet.
Which model is best for coding: Codestral 22B, Qwen 2.5 Coder 7B Instruct, or StarCoder2 15B Instruct? The benchmark comparison above has no data yet, so the Editor's take is the main guide. For production code tasks, also consider license terms, context window size, and provider latency.
What is the context window for Codestral 22B, Qwen 2.5 Coder 7B Instruct, and StarCoder2 15B Instruct? 33K, 131K, and 16K tokens respectively, as listed in the Context window row of the specs table above.
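
To make the context window figures concrete, the sketch below checks which models could hold a prompt of a given size. It is a rough Python illustration under stated assumptions: "K" is read as 1,000 tokens, the 4-characters-per-token ratio is a crude heuristic rather than any model's real tokenizer, and `models_that_fit` is a hypothetical helper name.

```python
# Context windows in tokens, from the specs table, reading "K" as 1,000 (an assumption).
CONTEXT_TOKENS = {
    "Codestral 22B": 33_000,
    "Qwen 2.5 Coder 7B Instruct": 131_000,
    "StarCoder2 15B Instruct": 16_000,
}

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def models_that_fit(prompt_chars: int, reserved_output_tokens: int = 1_000):
    """List models whose window holds the estimated prompt plus reserved output."""
    estimated_prompt_tokens = -(-prompt_chars // CHARS_PER_TOKEN)  # ceiling division
    needed = estimated_prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_TOKENS.items() if window >= needed]


# About 100,000 characters of source code plus room for a 1,000-token reply.
print(models_that_fit(100_000))
```

For real capacity planning, the character-based estimate should be replaced with a count from each model's own tokenizer, since code tokenizes quite differently from prose.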

Full model details

All providers for Codestral 22B →

All providers for Qwen 2.5 Coder 7B Instruct →

All providers for StarCoder2 15B Instruct →