Model crosswalk
Side-by-side on price, capability and workload — three-way comparison.
DBRX Instruct vs Mixtral 8x22B Instruct vs WizardLM-2 8x22B
A: DBRX Instruct
132B params · 33K context · databricks-open-model
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
B: Mixtral 8x22B Instruct
141B params · 66K context · apache-2.0
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
C: WizardLM-2 8x22B
141B params · 66K context · wizardlm-2-community
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
Specs and cheapest providers
| Spec | DBRX Instruct | Mixtral 8x22B Instruct | WizardLM-2 8x22B |
|---|---|---|---|
| Parameters | 132B | 141B | 141B |
| Context window | 33K tokens | 66K tokens | 66K tokens |
| License | databricks-open-model | apache-2.0 | wizardlm-2-community |
| Released | 2024-03-27 | 2024-04-17 | 2024-04-15 |
| Cheapest provider | | | |
| Provider | — | — | — |
| Input / 1M tokens | — | — | — |
| Output / 1M tokens | — | — | — |
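
The per-1M-token prices in the table translate into a per-request cost with simple arithmetic. The sketch below assumes hypothetical prices, since none are listed yet; the function name and the example numbers are illustrative only and do not come from any provider.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices quoted per 1M tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical prices (USD per 1M tokens), NOT real provider quotes.
HYPOTHETICAL_INPUT_PRICE = 0.5
HYPOTHETICAL_OUTPUT_PRICE = 1.5

# A request with a 2,000-token prompt and a 500-token completion.
cost = request_cost_usd(2_000, 500, HYPOTHETICAL_INPUT_PRICE, HYPOTHETICAL_OUTPUT_PRICE)
print(f"{cost:.5f}")  # prints 0.00175
```

Because input and output are priced separately, a workload dominated by long prompts can rank the three models differently from one dominated by long completions.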
Benchmark comparison
No benchmark data available yet.
Editor's take
These are three MoE instruction models from 2024, each routing roughly 36B to 39B active parameters per token and each representing a different publisher's approach to large open-weights inference. DBRX Instruct from Databricks launched in March 2024 with 132B total parameters and 16 fine-grained experts, beating Mixtral 8x7B on most benchmarks at the time. Its 32K (32,768-token) context window, half that of the other two, is its main limitation here. The Databricks Open Model License is permissive for most commercial use but is not OSI-approved, so it is worth a legal review before deployment. Its clearest remaining use case is teams running in the Databricks ecosystem who want a native Databricks model for integration reasons.
Mixtral 8x22B Instruct is Mistral's 141B-total-parameter release from April 2024, with roughly 39B parameters active per token and a 64K context window. It was a significant step up from Mixtral 8x7B on reasoning and multilingual tasks, and its base model is what WizardLM-2 8x22B is fine-tuned from. The Apache 2.0 license is the cleanest of the three for commercial deployment and redistribution.
WizardLM-2 8x22B is Microsoft Research's April 2024 fine-tune of the Mixtral 8x22B base using Evol-Instruct, which produced measurably stronger multi-turn conversational benchmark scores at release. The WizardLM 2 Community License carries non-standard attribution clauses that require review before commercial deployment. At 64K context and the same active parameter count, it slots in above the base Mixtral 8x22B on conversational quality.
Pick DBRX when you are already in the Databricks stack. Pick Mixtral 8x22B when Apache licensing and self-hosting flexibility matter. Pick WizardLM-2 8x22B for multi-turn conversational tasks where you can accommodate the attribution requirements.
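
The selection advice above can be turned into a simple filter over the spec table. This is a minimal sketch, not a recommendation engine: `ModelSpec` and `shortlist` are hypothetical names, and the context sizes assume that the 32K and 64K figures mean 32,768 and 65,536 tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    params_b: int          # total parameters, in billions
    context_tokens: int    # assumed exact token counts for 32K / 64K
    license: str


MODELS = [
    ModelSpec("DBRX Instruct", 132, 32_768, "databricks-open-model"),
    ModelSpec("Mixtral 8x22B Instruct", 141, 65_536, "apache-2.0"),
    ModelSpec("WizardLM-2 8x22B", 141, 65_536, "wizardlm-2-community"),
]


def shortlist(min_context_tokens: int,
              allowed_licenses: Optional[set] = None) -> list:
    """Names of models whose context window and license fit the constraints."""
    return [
        m.name
        for m in MODELS
        if m.context_tokens >= min_context_tokens
        and (allowed_licenses is None or m.license in allowed_licenses)
    ]


# Prompts of up to 40,000 tokens rule out the 32K-context model.
print(shortlist(40_000))
# Restricting to Apache 2.0 leaves a single candidate.
print(shortlist(8_000, {"apache-2.0"}))
```

A hard filter like this only narrows the field; ecosystem fit and conversational quality, the other two criteria above, still need a judgment call.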
Frequently asked questions
- How does DBRX Instruct compare to Mixtral 8x22B Instruct and WizardLM-2 8x22B on price?
  Compare the input and output prices per 1M tokens in the specs table above, which lists the cheapest available provider for each model.
- Which model is best for coding: DBRX Instruct, Mixtral 8x22B Instruct, or WizardLM-2 8x22B?
  No code benchmark data, such as HumanEval scores, is available for these models yet. For production code tasks, also consider context window size and provider latency.
- What is the context window for DBRX Instruct, Mixtral 8x22B Instruct, and WizardLM-2 8x22B?
  Context window sizes are listed in the Context window row of the specs table above.