0 providers50 models

Model crosswalk

Side-by-side on price, capability and workload — three-way comparison.

Gemma 2 9B IT
vs
Granite 3.1 8B Instruct
vs
Mistral 7B Instruct v0.3
Gemma 2 9B ITA

Gemma 2 9B IT

9B params · 8K context · gemma

Cheapest providerdeepinfra
$/1M input$50000.00
$/1M output$60000.00
Granite 3.1 8B InstructB

Granite 3.1 8B Instruct

8B params · 131K context · apache-2.0

Cheapest provider
$/1M input
$/1M output
Mistral 7B Instruct v0.3C

Mistral 7B Instruct v0.3

7B params · 33K context · apache-2.0

Cheapest provider
$/1M input
$/1M output
Specs and cheapest providers
SpecGemma 2 9B ITGranite 3.1 8B InstructMistral 7B Instruct v0.3
Parameters9B8B7B
Context window8K tokens131K tokens🏆33K tokens
Licensegemmaapache-2.0apache-2.0
Released2024-07-312024-12-192024-05-22
Cheapest provider
Providerdeepinfra
Input / 1M tokens$50000.00
Output / 1M tokens$60000.00
Benchmark comparison

No benchmark data available yet.

Editor's take
Gemma 2 9B IT, Granite 3.1 8B Instruct, and Mistral 7B Instruct v0.3 are all sub-10B instruction-tuned models from mid-2024 that remain in active use. They are close enough in capability that the decision between them usually comes down to context window requirements, licensing constraints, and which provider ecosystem you are already in. Mistral 7B Instruct v0.3 is the Apache 2.0 baseline here. The 32K context window handles most summarization and classification workloads, and pricing typically falls below $0.10 per million tokens. The v0.3 update added native function-calling support, making it a complete enough foundation for lightweight agentic pipelines. For teams that need the cleanest possible open-source licensing for fine-tuning or redistribution, this is the most permissive and mature option in the group. Gemma 2 9B IT from Google DeepMind is a 9-billion-parameter model released July 2024 and distilled from Gemini Ultra training data. On standard benchmarks it runs comparably to Mistral 7B and Llama 3.1 8B, with Groq hosting offering some of the lowest latency numbers for this size class. The 8K context ceiling is a harder constraint than either of the other two models here. The Gemma license is permissive for commercial use but not OSI-approved. Granite 3.1 8B Instruct from IBM, released December 2024 as an Apache 2.0 model, upgraded context from 4K to 128K tokens in the 3.1 revision — the longest window in this comparison by a significant margin. IBM tuned it for structured-output and tool-use scenarios, and enterprise RAG pipelines that need to ingest long documents in a single 8B-class pass have a concrete reason to evaluate it. Primary hosting is IBM's watsonx.ai. Pick Mistral 7B v0.3 for the best Apache 2.0 general-purpose baseline. Pick Gemma 2 9B for low-latency short-context inference on Groq. Pick Granite 3.1 8B when 128K context at sub-10B parameter cost is the deciding requirement.
Compare two at a time
Frequently asked questions
How does Gemma 2 9B IT compare to Granite 3.1 8B Instruct and Mistral 7B Instruct v0.3 on price?
Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
Which model is best for coding: Gemma 2 9B IT, Granite 3.1 8B Instruct, or Mistral 7B Instruct v0.3?
HumanEval and other code benchmarks are shown in the table. For production code tasks, also consider context window size and provider latency.
What is the context window for Gemma 2 9B IT, Granite 3.1 8B Instruct, and Mistral 7B Instruct v0.3?
Context window sizes are listed in the Specs row of the comparison table above.
Full model details