
Model crosswalk

A three-way, side-by-side comparison on price, capability, and workload.

Granite 3.1 2B Instruct vs Llama 3.2 1B Instruct vs Stable Code Instruct 3B
Granite 3.1 2B Instruct

2B params · 131K context · apache-2.0

Cheapest provider
$/1M input
$/1M output
Llama 3.2 1B Instruct

1B params · 131K context · llama-3

Cheapest provider
$/1M input
$/1M output
Stable Code Instruct 3B

3B params · 16K context · stability-ai-nc-community

Cheapest provider
$/1M input
$/1M output
Specs and cheapest providers
| Spec | Granite 3.1 2B Instruct | Llama 3.2 1B Instruct | Stable Code Instruct 3B |
| --- | --- | --- | --- |
| Parameters | 2B | 1B | 3B |
| Context window | 131K tokens | 131K tokens | 16K tokens |
| License | apache-2.0 | llama-3 | stability-ai-nc-community |
| Released | 2024-12-19 | 2024-09-25 | 2024-01-11 |
Cheapest provider
Provider
Input / 1M tokens
Output / 1M tokens
Benchmark comparison

No benchmark data available yet.

Editor's take
Three models at the very small end of the parameter spectrum: one enterprise-tuned, one designed for mobile, and one that the market has effectively retired.

Granite 3.1 2B Instruct is IBM's smallest production Granite model, released as part of the Granite 3 series with 2 billion parameters and a 128K-token context window (131,072 tokens, listed as 131K in the specs above). That context depth is uncommon at 2B scale and gives it a real edge over short-context peers such as Gemma 2 2B for long-document classification and structured extraction. IBM designed Granite 3 for enterprise compliance workflows, tool use, and structured output rather than creative generation. The Apache 2.0 license is royalty-free for commercial deployment. Primary hosting is on watsonx.ai, with growing coverage on third-party inference providers.

Llama 3.2 1B Instruct is Meta's smallest Llama model, released in September 2024 under the Llama 3.2 Community License. At 1 billion parameters, its primary design target is on-device inference on phones and edge hardware where 3B or 8B models exceed memory budgets. Its quality ceiling is low: summarization, complex instruction following, and coding are not viable at this size. The model is useful for latency profiling at the smallest weight class or for tightly constrained triage pipelines. Hosted API pricing is available below $0.05 per million tokens on several platforms.

Stable Code Instruct 3B from Stability AI, released in January 2024, targeted single-file fill-in-the-middle code completion at 3B parameters with a 16K context window. By mid-2026 it has no remaining production use case: Qwen 2.5 Coder and DeepSeek Coder variants deliver materially better HumanEval scores at similar or lower hosted cost, and Stability AI's non-commercial community license requires a paid membership for production use, friction that few teams will accept when Apache-licensed alternatives exist at 7B scale.

Pick Granite 3.1 2B for enterprise extraction and long-document classification under an Apache 2.0 license. Pick Llama 3.2 1B for on-device inference on mobile or edge hardware where memory budgets are extremely tight. Skip Stable Code 3B for any new production deployment.
Frequently asked questions
How does Granite 3.1 2B Instruct compare to Llama 3.2 1B Instruct and Stable Code Instruct 3B on price?
Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
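To turn per-1M-token prices into a workload estimate, multiply each token count by its rate and add the two parts. The sketch below shows that arithmetic only. The `Price` class, the `workload_cost` helper, and the example prices are all hypothetical placeholders, not quotes from any provider or from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical price record; field names are illustrative only."""
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


def workload_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the given rates."""
    input_cost = (input_tokens / 1_000_000) * price.input_per_1m
    output_cost = (output_tokens / 1_000_000) * price.output_per_1m
    return input_cost + output_cost


# Placeholder prices (NOT real provider pricing): $0.10 in, $0.20 out per 1M tokens.
example = Price(input_per_1m=0.10, output_per_1m=0.20)

# 5M input tokens and 1M output tokens: 5 * 0.10 + 1 * 0.20 = 0.70
cost = workload_cost(example, input_tokens=5_000_000, output_tokens=1_000_000)
print(f"${cost:.2f}")  # prints "$0.70"
```

Running the same function with each model's listed input and output prices gives a like-for-like comparison for a specific input-to-output token mix, which a single headline price does not.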
Which model is best for coding: Granite 3.1 2B Instruct, Llama 3.2 1B Instruct, or Stable Code Instruct 3B?
No code benchmark data such as HumanEval is available for these models yet (see Benchmark comparison above). For production code tasks, also weigh context window size and provider latency.
What is the context window for Granite 3.1 2B Instruct, Llama 3.2 1B Instruct, and Stable Code Instruct 3B?
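Granite 3.1 2B Instruct and Llama 3.2 1B Instruct each have a 131K-token context window; Stable Code Instruct 3B has 16K tokens. These values appear in the Context window row of the specs table above.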
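The context figures in the specs table can be turned into a quick fit check: a request fits only if the prompt plus the reserved output tokens stays within the model's window. The sketch below assumes that "131K" and "16K" mean 131,072 and 16,384 tokens. Those exact values, the `CONTEXT_WINDOWS` mapping, and the `models_that_fit` helper are illustrative assumptions, not part of the page's data.

```python
# Context windows in tokens. Exact values are an assumption:
# the specs table lists them only as "131K" and "16K".
CONTEXT_WINDOWS = {
    "Granite 3.1 2B Instruct": 131_072,
    "Llama 3.2 1B Instruct": 131_072,
    "Stable Code Instruct 3B": 16_384,
}


def models_that_fit(prompt_tokens: int, max_output_tokens: int) -> list[str]:
    """Return the models whose window holds the prompt plus reserved output."""
    needed = prompt_tokens + max_output_tokens
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# A 20,000-token document with 1,000 tokens reserved for the answer
# exceeds the 16K window but fits both 131K models.
print(models_that_fit(prompt_tokens=20_000, max_output_tokens=1_000))
# prints ['Granite 3.1 2B Instruct', 'Llama 3.2 1B Instruct']
```

Token counts depend on each model's tokenizer, so the same document may produce different counts per model; treat the check as approximate unless the prompt is tokenized with the target model's own tokenizer.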