Model crosswalk
Side-by-side on price, capability and workload — three-way comparison.
Gemma 2 27B IT
vs
Llama 3.3 70B Instruct
vs
Qwen 3 32B Instruct
Gemma 2 27B ITA
Gemma 2 27B IT
27B params · 8K context · gemma
Cheapest provider—
$/1M input—
$/1M output—
Llama 3.3 70B InstructB
Llama 3.3 70B Instruct
70B params · 131K context · llama-3
Cheapest providerfireworks-ai
$/1M input$220000.00
$/1M output$880000.00
Qwen 3 32B InstructC
Qwen 3 32B Instruct
32B params · 131K context · qwen
Cheapest provideropenrouter
$/1M input$140000.00
$/1M output$550000.00
Specs and cheapest providers
| Spec | Gemma 2 27B IT | Llama 3.3 70B Instruct | Qwen 3 32B Instruct |
|---|---|---|---|
| Parameters | 27B | 70B | 32B |
| Context window | 8K tokens | 131K tokens | 131K tokens |
| License | gemma | llama-3 | qwen |
| Released | 2024-07-31 | 2024-12-06 | 2025-04-28 |
| Cheapest provider | |||
| Provider | — | fireworks-ai | openrouter |
| Input / 1M tokens | — | $220000.00 | $140000.00🏆 |
| Output / 1M tokens | — | $880000.00 | $550000.00🏆 |
Benchmark comparison
No benchmark data available yet.
Editor's take
A compact dense model from Google, a proven 70B workhorse from Meta, and a multilingual mid-tier from Alibaba. Gemma 2 27B IT is Google DeepMind's July 2024 model, distilled from Gemini Ultra training data with competitive MT-Bench scores for its 27B parameter class. The instruction-following quality is reliable in single-turn and short-context settings, and the Gemma license permits commercial use. The hard limit is the 8K context ceiling: anything involving long documents, multi-turn memory, or RAG over large corpora is disqualified. For use cases where 8K is adequate, it delivers strong quality at a sub-70B cost.
Llama 3.3 70B Instruct doubled the context to 131K in Meta's December 2024 release, which resolves the primary constraint. General instruction-following, summarization, and classification all improve noticeably over 3.1 70B. The Llama 3 community license is permissive and broadly understood. Provider coverage is the widest of any model in this tier. At the 70B parameter scale, it handles tasks where Gemma 2 27B begins to struggle with complexity.
Qwen 3 32B Instruct from Alibaba, released April 2025, sits between these two on parameter count with a 131K context window and roughly 85 percent of Qwen 3 72B benchmark quality. Its core advantage over both competitors is multilingual breadth — Chinese, Japanese, Korean, and Arabic quality is noticeably better than Llama 3.3 70B and far beyond Gemma 2 27B. The Qwen commercial license covers production use.
Pick Gemma 2 27B for workloads where 8K context is sufficient and you want the smallest footprint. Pick Llama 3.3 70B for long-context general tasks with permissive licensing. Pick Qwen 3 32B when multilingual coverage matters and you want a cost middle-ground between 27B and 70B options.
Compare two at a time
Frequently asked questions
- How does Gemma 2 27B IT compare to Llama 3.3 70B Instruct and Qwen 3 32B Instruct on price?
- Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
- Which model is best for coding: Gemma 2 27B IT, Llama 3.3 70B Instruct, or Qwen 3 32B Instruct?
- HumanEval and other code benchmarks are shown in the table. For production code tasks, also consider context window size and provider latency.
- What is the context window for Gemma 2 27B IT, Llama 3.3 70B Instruct, and Qwen 3 32B Instruct?
- Context window sizes are listed in the Specs row of the comparison table above.
Full model details