Gemma 2 27B IT vs Mistral Small 3 (2026) — pricing, benchmarks, cheapest providers

Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

Gemma 2 27B IT

27B params · 8K context · gemma

Cheapest provider: not listed

$/1M input: not listed

$/1M output: not listed

Mistral Small 3

24B params · 33K context · apache-2.0

Cheapest provider: openrouter

$/1M input: $0.10

$/1M output: $0.30

Specs and cheapest providers

Spec	Gemma 2 27B IT	Mistral Small 3
Parameters	27B	24B
Context window	8K tokens	33K tokens🏆
License	gemma	apache-2.0
Released	2024-07-31	2025-01-30
Cheapest provider
Provider	n/a	openrouter
Input / 1M tokens	n/a	$0.10
Output / 1M tokens	n/a	$0.30

Mistral Small 3 ranks #3 for cheapest input and #6 for cheapest output.

Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

using each model's cheapest provider

Gemma 2 27B IT

Not available (no provider listed)

Mistral Small 3

$1.10 /mo
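
The monthly figure is a per-million-token rate applied to each side of the workload and summed. A minimal sketch of that arithmetic follows; the function name `monthly_cost` and the example rates (0.10 and 0.30 dollars per 1M tokens) are illustrative assumptions, so substitute the provider's current rates.

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Return the cost in dollars for a token workload.

    Rates are expressed in dollars per 1M tokens, as on the pricing table.
    """
    per_million = 1_000_000
    input_cost = (input_tokens / per_million) * input_rate_per_m
    output_cost = (output_tokens / per_million) * output_rate_per_m
    return input_cost + output_cost


# Sample workload: 5M input + 2M output tokens per month.
# The rates below are assumed example values, not quoted prices.
cost = monthly_cost(5_000_000, 2_000_000, 0.10, 0.30)
print(f"${cost:.2f} /mo")
```

A model with no listed provider has no rate to plug in, so its cost is undefined rather than zero.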

What changes at scale

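When output is priced at three times input, as it is for Mistral Small 3 here, output tokens dominate cost once output volume exceeds one third of input volume (an input:output ratio below 3:1). Above 3:1, input dominates and cheaper-input providers win regardless of the headline output price.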

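1M in · 250K out: Gemma 2 27B IT n/a · Mistral Small 3 $0.175

5M in · 2M out: Gemma 2 27B IT n/a · Mistral Small 3 $1.10

20M in · 10M out: Gemma 2 27B IT n/a · Mistral Small 3 $5.00

100M in · 60M out: Gemma 2 27B IT n/a · Mistral Small 3 $28.00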
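
To see which side of the bill dominates at each scale, the workloads above can be split into input and output components. The sketch below does that and also computes the break-even input:output ratio. The helper names and the example rates (output priced at three times input) are assumptions for illustration only.

```python
def cost_split(input_tokens, output_tokens, input_rate_per_m, output_rate_per_m):
    """Return (input_cost, output_cost) in dollars; rates are per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_m
    output_cost = output_tokens / 1_000_000 * output_rate_per_m
    return input_cost, output_cost


def break_even_ratio(input_rate_per_m, output_rate_per_m):
    """Input:output token ratio at which both sides cost the same."""
    return output_rate_per_m / input_rate_per_m


# Assumed example rates: output priced at 3x input.
INPUT_RATE, OUTPUT_RATE = 0.10, 0.30

# (input tokens, output tokens) for the four workloads listed above.
WORKLOADS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]

dominant = []
for tokens_in, tokens_out in WORKLOADS:
    in_cost, out_cost = cost_split(tokens_in, tokens_out, INPUT_RATE, OUTPUT_RATE)
    side = "input" if in_cost > out_cost else "output"
    dominant.append(side)
    print(f"{tokens_in:>11,} in / {tokens_out:>10,} out: "
          f"input ${in_cost:.3f}, output ${out_cost:.3f} -> {side} dominates")

print(f"break-even input:output ratio ~ {break_even_ratio(INPUT_RATE, OUTPUT_RATE):.1f}:1")
```

Under these assumed rates only the first workload (4:1 input to output) is input-dominated; the other three sit below the 3:1 break-even, so output cost is the larger share.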

Capability vs price

[Scatter plot: benchmark score vs. $/1M output tokens]

Editor's take

[Gemma 2 27B IT](/models/google--gemma-2-27b-it) is Google DeepMind's instruction-tuned 27B dense model from the Gemma 2 family, which interleaves local sliding-window attention with global attention. [Mistral Small 3](/models/mistralai--mistral-small-3) sits in the sub-25B tier with Mistral's characteristic efficiency focus: low latency, high throughput, and competitive reasoning at the small-model price point. Both models target the cost-conscious segment, with typical hosted rates under $0.20/M input tokens at competitive providers.

Gemma 2 27B IT has a slight size advantage and scores well on knowledge benchmarks (MMLU ~74–76%). Mistral Small 3 is generally faster to first token on providers with dedicated Mistral infrastructure.

Gemma 2 27B IT performs well on document understanding and summarization tasks that fit within its 8K-token context window. It also benefits from Google's safety tuning investments, making it a reliable choice for consumer-facing applications with stricter content filtering requirements. Mistral Small 3 earns its position on function-calling and structured-output pipelines: Mistral has prioritized tool-use reliability across its model family, and Small 3 inherits those improvements. For high-volume API pipelines that call the model hundreds of times per minute and need reliable JSON extraction or tool dispatch, Mistral Small 3's lower latency and consistent structured-output behavior reduce pipeline error rates.

Pick Gemma 2 27B IT if knowledge-intensive tasks, document summarization within an 8K-token window, or Google's safety-tuning profile match your application requirements. Pick Mistral Small 3 if first-token latency, function-calling reliability, and high-throughput structured-output pipelines are the primary requirements.
