0 providers · 50 models
Cheapest hosting · live

LLM leaderboards

Objective rankings for every dimension that matters in production: input/output token pricing, blended monthly cost, time-to-first-token, throughput, context window, and benchmarks. Scraped nightly, with no estimates. Last updated May 2026.

Browse by dimension

9 surfaces

Cheapest LLM Input Price

Find the most cost-effective models for prompt-heavy workloads. Ranked by the lowest input token price across all providers, updated nightly from live scrapes.
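A minimal sketch of the ranking described above: keep the cheapest offer per model across providers, then sort ascending. The `Offer` type, its field names, and the per-million-token price unit are assumptions made for illustration; the page does not document its data schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One provider's listing for a model (hypothetical schema)."""
    model: str
    provider: str
    input_price: float  # assumed unit: USD per 1M input tokens


def rank_by_cheapest_input(offers):
    """Return the cheapest offer per model, sorted by price, then model name."""
    best = {}
    for offer in offers:
        current = best.get(offer.model)
        if current is None or offer.input_price < current.input_price:
            best[offer.model] = offer
    return sorted(best.values(), key=lambda o: (o.input_price, o.model))


# Hypothetical data, not taken from the leaderboard.
offers = [
    Offer("model-a", "provider-x", 0.30),
    Offer("model-a", "provider-y", 0.20),
    Offer("model-b", "provider-x", 0.25),
]
for rank, offer in enumerate(rank_by_cheapest_input(offers), start=1):
    print(rank, offer.model, offer.provider, offer.input_price)
```

The same pattern applies to output price by swapping the field used for comparison.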


Cheapest LLM Output Price

Output tokens dominate cost for generation-heavy use cases. This leaderboard ranks models by the lowest output token price across all providers.


Cheapest Blended LLM Cost

Blended cost is the total monthly bill for a workload of 100M input tokens and 10M output tokens, a realistic cost-of-ownership comparison for most production applications.
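The blended figure can be sketched as a weighted sum of the two token prices. This assumes prices are quoted in USD per 1M tokens, which the page does not state; the function name and the example prices are hypothetical.

```python
def blended_monthly_cost(
    input_price_per_m: float,
    output_price_per_m: float,
    input_tokens: int = 100_000_000,
    output_tokens: int = 10_000_000,
) -> float:
    """Monthly cost in USD for a fixed workload.

    Prices are assumed to be USD per 1M tokens. The default token counts
    match the 100M input / 10M output workload used by this leaderboard.
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical prices: $0.50 per 1M input tokens, $2.00 per 1M output tokens.
print(blended_monthly_cost(0.50, 2.00))  # 100 * 0.50 + 10 * 2.00 = 70.0
```

Because the workload is input-heavy by a factor of ten, a model with a low input price can rank well here even if its output price is comparatively high.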


Fastest LLM Time to First Token

Time to first token (TTFT) determines how quickly a response starts streaming to your users. Lower is better. Each value is the best published TTFT for that model across providers.


Highest LLM Throughput (tok/s)

Throughput (tokens per second) determines how fast a model generates output. Critical for batch workloads and applications where generation speed matters.
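As a rough illustration of how TTFT and throughput combine, the sketch below estimates end-to-end response time as TTFT plus generation time. It assumes a constant decode rate and ignores queueing and network overhead; the function name and the sample figures are hypothetical.

```python
def estimated_response_seconds(
    ttft_seconds: float,
    tokens_per_second: float,
    output_tokens: int,
) -> float:
    """Approximate wall-clock time to receive a full response.

    Simplifying assumption: tokens arrive at a constant rate after the
    first token, so total time = TTFT + output_tokens / throughput.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# Hypothetical figures: 0.5 s TTFT, 100 tok/s, 500-token answer.
print(estimated_response_seconds(0.5, 100.0, 500))  # 0.5 + 500 / 100 = 5.5
```

For short answers TTFT dominates the total; for long generations throughput does, which is why the two are ranked separately.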


Longest LLM Context Window

Context window determines how much text a model can process in a single call — essential for document summarisation, long-form coding, and RAG pipelines.


Best LLM MMLU Score

MMLU (Massive Multitask Language Understanding) measures reasoning and knowledge across 57 subjects. Higher is better. Scores sourced from published model cards and papers.


Best LLM HumanEval Score

HumanEval measures code-generation ability: the percentage of coding problems solved correctly (pass@1). Higher is better. Sourced from published evals.
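For reference, pass@k is commonly computed with the unbiased estimator introduced alongside HumanEval: with n samples per problem, of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The sketch below implements that formula; the function names are illustrative, and the scores on this leaderboard are taken from published evals rather than computed this way here.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: attempt budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int) -> float:
    """Mean pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical results for three problems, 10 samples each.
results = [(10, 10), (10, 5), (10, 0)]
print(benchmark_pass_at_k(results, k=1))
```

With k = 1 the estimator reduces to c / n, the fraction of samples that solve the problem.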


Most Available LLM Providers

Provider count indicates ecosystem breadth and supply-side competition. A model available on more providers gives you more fallback options when one provider has downtime or rate limits.


Best overall — cheapest blended cost

Ranked by total monthly cost for a 100M input / 10M output workload.

18 models
#   Model                          Family     Blended monthly cost  Providers  Last updated
01  Gemma 2 9B IT                  gemma      $0.06/mo              3          May 16
02  Llama 3.1 8B Instruct          llama      $0.06/mo              4          May 16
03  Mistral Small 3                mistral    $0.13/mo              1          May 16
04  Qwen 2.5 Coder 32B Instruct    qwen       $0.14/mo              1          May 16
05  Qwen 3 32B Instruct            qwen       $0.20/mo              2          May 16
06  Qwen 2.5 72B Instruct          qwen       $0.21/mo              3          May 16
07  Mixtral 8x7B Instruct          mixtral    $0.22/mo              2          May 16
08  Qwen 3 72B Instruct            qwen       $0.25/mo              4          May 17
09  Llama 3.3 70B Instruct         llama      $0.26/mo              5          May 17
10  Llama 3.1 70B Instruct         llama      $0.26/mo              3          May 16
11  DeepSeek V3                    deepseek   $0.28/mo              3          May 16
12  DeepSeek R1 Distill Llama 70B  deepseek   $0.34/mo              3          May 16
13  DeepSeek V3.2                  deepseek   $0.38/mo              1          May 17
14  DeepSeek R1                    deepseek   $0.60/mo              3          May 16
15  Mixtral 8x22B Instruct         mixtral    $0.67/mo              4          May 17
16  Mistral Large 2                mistral    $2.34/mo              1          May 16
17  Llama 3.1 405B Instruct        llama      $3.05/mo              4          May 17
18  Command R+                     command-r  $3.50/mo              1          May 16