Mistral Small 3 — pricing, providers, and benchmarks

Parameters: 24B
Context window: 32K tokens
License: Apache-2.0
Released: 2025-01-30

Mistral Small 3 (24B parameters, 32K context, released early 2025) is Mistral's answer to the "small-but-not-tiny" gap between 8B speed-tier models and 70B quality-tier models. It outperforms Llama 3.1 8B on most benchmarks while costing roughly 2–3x as much per token, making it the right pick when 8B's accuracy ceiling becomes a problem but 70B is overkill. Pricing on hosted providers usually sits at $0.10–$0.20 per 1M tokens, affordable enough for high-volume use. It is a strong choice for production chatbots that need to handle ambiguous queries, mid-tier RAG, or workflow agents that occasionally call tools. Apache 2.0 licensed.

Provider pricing

Sorted by total monthly cost for 100M input + 10M output tokens.

Provider      Input / 1M   Output / 1M   Monthly cost   Context
OpenRouter    $0.1000      $0.3000       $13.00         33k
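The monthly-cost column can be reproduced from the per-1M-token prices. A minimal sketch (the function name and workload figures are illustrative, not part of any provider API):

```python
def monthly_cost(input_tokens: float, output_tokens: float,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Total monthly cost in USD, given token volumes and per-1M-token prices."""
    return (input_tokens / 1e6) * input_price_per_m \
         + (output_tokens / 1e6) * output_price_per_m

# Standard workload from the table: 100M input + 10M output at OpenRouter rates.
cost = monthly_cost(100e6, 10e6, 0.10, 0.30)
print(f"${cost:.2f}")  # $13.00
```

The same function applied to your own input/output split shows how quickly the total shifts for output-heavy workloads, since output tokens here cost 3x as much as input tokens.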

Frequently asked questions

How much does it cost to run Mistral Small 3 for 100M tokens?

Running Mistral Small 3 with 100M input and 10M output tokens per month costs approximately $13.00 on OpenRouter, the cheapest available provider as of the latest pricing data. Costs vary significantly depending on your input/output ratio and whether you use prompt caching.

What is the cheapest provider for Mistral Small 3?

OpenRouter currently offers Mistral Small 3 at the lowest total cost for a standard workload. Prices change frequently — check the table above for the latest data.

What context window does Mistral Small 3 support?

Mistral Small 3 supports a context window of 32,768 tokens. Individual providers may cap this lower — see the pricing table for per-provider context limits.
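A quick way to sanity-check whether a request fits in the 32,768-token window is the common rough heuristic of ~4 characters per English token. This is an assumption for estimation only; exact counts require the model's actual tokenizer:

```python
def fits_in_context(prompt: str, max_output_tokens: int,
                    context_window: int = 32_768) -> bool:
    """Rough fit check using the ~4 characters-per-token heuristic (an
    approximation; use the model's tokenizer for exact counts)."""
    est_prompt_tokens = len(prompt) / 4
    return est_prompt_tokens + max_output_tokens <= context_window

# A 6,000-character prompt (~1,500 tokens) plus a 1,024-token reply fits easily.
print(fits_in_context("hello " * 1000, max_output_tokens=1024))  # True
```

Remember to budget for the output: the context window covers prompt and completion together, so a long reply shrinks the room available for retrieved documents or chat history.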