
Model crosswalk

Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.

Yi 1.5 34b Chat vs Yi 1.5 9b Chat
Specs and cheapest providers
Spec · Yi 1.5 34b Chat · Yi 1.5 9b Chat
Parameters
Context window
License
Released
Cheapest provider
Provider
Input / 1M tokens
Output / 1M tokens


Benchmark comparison

No benchmark data available for either model yet.

Sample workload — 5M in + 2M out per month

using each model's cheapest provider
Yi 1.5 34b Chat: $0.00 /mo
Yi 1.5 9b Chat: $0.00 /mo

What changes at scale

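Output tokens dominate cost once a workload generates about three or more output tokens per input token (a 1:3 input/output ratio). When input volume exceeds output volume (more input-heavy than 1:1), input dominates, and providers with cheaper input pricing win regardless of headline price.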

1M in · 250K out: $0.00 · $0.00
5M in · 2M out: $0.00 · $0.00
20M in · 10M out: $0.00 · $0.00
100M in · 60M out: $0.00 · $0.00
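
Which side of the bill dominates depends on both the token mix and the gap between input and output prices. The arithmetic can be sketched as below. The two prices are hypothetical placeholders (not quotes from any provider), and the function names are illustrative only.

```python
# Hypothetical per-1M-token prices in USD; placeholders, not provider quotes.
PRICE_IN = 0.30
PRICE_OUT = 0.30

# The four workload tiers listed above: (input tokens, output tokens) per month.
TIERS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]


def monthly_cost(tokens_in, tokens_out, price_in=PRICE_IN, price_out=PRICE_OUT):
    """Return (input_cost, output_cost) in USD for one month."""
    return tokens_in / 1e6 * price_in, tokens_out / 1e6 * price_out


def output_share(tokens_in, tokens_out, price_in=PRICE_IN, price_out=PRICE_OUT):
    """Fraction of the monthly bill attributable to output tokens."""
    cost_in, cost_out = monthly_cost(tokens_in, tokens_out, price_in, price_out)
    return cost_out / (cost_in + cost_out)


def breakeven_output_per_input(price_in=PRICE_IN, price_out=PRICE_OUT):
    """Output tokens per input token at which input and output cost the same."""
    return price_in / price_out


for tokens_in, tokens_out in TIERS:
    share = output_share(tokens_in, tokens_out)
    print(f"{tokens_in:>11,} in / {tokens_out:>10,} out: output is {share:.0%} of cost")
```

With equal input and output prices, the break-even sits at one output token per input token, so all four tiers above are input-dominated. If output is priced higher than input, the break-even moves toward input-heavy mixes; `breakeven_output_per_input(0.10, 0.30)` gives roughly one output token per three input tokens.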

Capability vs price

[Scatter plot: benchmark score vs. $/1M output tokens]
Calculate cost for your workload

Compare total monthly cost across providers for Yi 1.5 34b Chat and Yi 1.5 9b Chat using your own input/output token mix.

Editor's take
Within the Yi 1.5 family, the 34B and 9B models share architecture and training data but differ in parameter count, and that gap shows up in both cost and quality. [Yi 1.5 34B Chat](/models/01-ai--yi-1.5-34b-chat) typically costs $0.25–0.40 per 1M tokens versus $0.03–0.07 for the 9B, a 5–8× cost ratio that makes model selection a straightforward cost-quality calculation.

On MMLU, the 34B scores approximately 76–78% versus the 9B's 65–68%, a gap of roughly 10 points. Multi-hop reasoning, long-context coherence, and adherence to complex instructions all improve substantially with the larger model. For tasks that require holding context across a full document (8K+ tokens) or following multi-constraint instructions, the 34B is noticeably more reliable.

[Yi 1.5 9B Chat](/models/01-ai--yi-1.5-9b-chat) handles single-turn Q&A, classification, summarization of short documents, and low-latency chatbot responses efficiently. Its memory footprint (it fits on a single A10G with room to spare) means better throughput per dollar on shared GPU infrastructure, and its cold-start latency is significantly lower than the 34B's.

Both models carry 01.AI's strong Mandarin Chinese training, so language quality is less of a differentiator within the family; the choice reduces to task complexity and budget.

Pick Yi 1.5 9B Chat for high-volume, cost-sensitive workloads where single-turn quality in the 65–68% MMLU range is acceptable. Pick Yi 1.5 34B Chat when multi-step reasoning, longer context, or higher accuracy justify the 5–8× increase in token cost.
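
As a rough illustration, the price ranges quoted above can be applied to this page's sample workload of 5M input plus 2M output tokens per month. The sketch assumes a single blended rate for input and output tokens, which the quoted ranges do not specify; the names are illustrative only.

```python
# Price ranges quoted above, in USD per 1M tokens.
# Assumption: one blended rate applies to both input and output tokens.
PRICE_RANGES = {
    "Yi 1.5 34B Chat": (0.25, 0.40),
    "Yi 1.5 9B Chat": (0.03, 0.07),
}

# Sample workload from this page: 5M input + 2M output tokens per month.
TOKENS_PER_MONTH = 5_000_000 + 2_000_000


def monthly_range(low, high, tokens=TOKENS_PER_MONTH):
    """Return the (low, high) monthly cost in USD at a blended per-1M price."""
    millions = tokens / 1_000_000
    return round(low * millions, 2), round(high * millions, 2)


for model, (low, high) in PRICE_RANGES.items():
    low_cost, high_cost = monthly_range(low, high)
    print(f"{model}: ${low_cost:.2f} to ${high_cost:.2f} per month")
```

Under that assumption the sample workload comes to about $1.75 to $2.80 per month on the 34B and $0.21 to $0.49 on the 9B, so at this volume the absolute difference is a couple of dollars even though the ratio is large.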