Model crosswalk
Side-by-side on price, capability and workload. Both columns use the cheapest provider for that model.
Yi 1.5 34b Chat vs Yi 1.5 9b Chat
Specs and cheapest providers
| Spec | Yi 1.5 34b Chat | Yi 1.5 9b Chat |
|---|---|---|
| Parameters | — | — |
| Context window | — | — |
| License | — | — |
| Released | — | — |
| Cheapest provider | | |
| Provider | — | — |
| Input / 1M tokens | — | — |
| Output / 1M tokens | — | — |
Benchmark comparison
No benchmark data available for either model yet.
Sample workload: 5M in + 2M out per month
using each model's cheapest provider
What changes at scale
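As a rule of thumb, output tokens dominate cost once output volume reaches about three times input volume (a 1:3 input/output ratio). When output volume falls below input volume (below 1:1), input spend tends to dominate and providers with cheaper input pricing tend to win regardless of headline price. The exact crossover depends on the ratio of output price to input price.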
1M in · 250K out: $0.00 · $0.00
5M in · 2M out: $0.00 · $0.00
20M in · 10M out: $0.00 · $0.00
100M in · 60M out: $0.00 · $0.00
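The split between input and output spend depends on both the token mix and the per-token prices. Here is a minimal Python sketch over the four volume tiers listed above. The page shows no provider prices for either model, so `IN_PRICE` and `OUT_PRICE` are hypothetical placeholders chosen for illustration only, and the function names are likewise illustrative.

```python
# Hypothetical prices in USD per 1M tokens; NOT real provider quotes.
IN_PRICE = 0.20
OUT_PRICE = 0.60

# (input tokens, output tokens) per month, taken from the tiers above.
TIERS = [
    (1_000_000, 250_000),
    (5_000_000, 2_000_000),
    (20_000_000, 10_000_000),
    (100_000_000, 60_000_000),
]

def monthly_cost(in_tokens, out_tokens, in_price=IN_PRICE, out_price=OUT_PRICE):
    """Return (input_cost, output_cost) in USD for one month."""
    input_cost = in_tokens / 1_000_000 * in_price
    output_cost = out_tokens / 1_000_000 * out_price
    return input_cost, output_cost

def output_dominates(in_tokens, out_tokens, in_price=IN_PRICE, out_price=OUT_PRICE):
    """True when output spend exceeds input spend."""
    input_cost, output_cost = monthly_cost(in_tokens, out_tokens, in_price, out_price)
    return output_cost > input_cost

for in_tok, out_tok in TIERS:
    input_cost, output_cost = monthly_cost(in_tok, out_tok)
    total = input_cost + output_cost
    print(f"{in_tok:>11,} in / {out_tok:>10,} out: "
          f"${total:,.2f} total, output share {output_cost / total:.0%}")
```

With these placeholder prices (output priced at 3× input), output spend overtakes input spend once output volume exceeds one third of input volume. A different price ratio moves that threshold, so it is worth rerunning the numbers with a provider's actual rates.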
Capability vs price
[Scatter plot: benchmark score vs. $/1M output tokens]
Editor's take
Within the Yi 1.5 family, the 34B and 9B models share architecture and training data but differ in parameter count, and that gap shows up in both cost and quality. [Yi 1.5 34B Chat](/models/01-ai--yi-1.5-34b-chat) typically costs $0.25–0.40 per 1M tokens versus $0.03–0.07 per 1M for the 9B, a roughly 5–8× cost ratio, so model selection comes down to whether the quality gain justifies that multiple.
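The 5–8× figure follows from comparing the quoted price ranges like for like (low end to low end, high end to high end). Here is a quick Python check. It treats the quoted ranges as blended per-million-token prices, which is an assumption because the text does not split input from output pricing. The helper name `ratio_bounds` is illustrative.

```python
# Price ranges quoted in the text, USD per 1M tokens (treated as blended; assumption).
PRICE_34B = (0.25, 0.40)
PRICE_9B = (0.03, 0.07)

def ratio_bounds(big, small):
    """Return like-for-like and extreme cost ratios between two (low, high) ranges."""
    like_for_like = (big[1] / small[1], big[0] / small[0])  # high/high, low/low
    extremes = (big[0] / small[1], big[1] / small[0])       # narrowest, widest gap
    return like_for_like, extremes

like_for_like, extremes = ratio_bounds(PRICE_34B, PRICE_9B)
print(f"like-for-like: {like_for_like[0]:.1f}x to {like_for_like[1]:.1f}x")
print(f"extremes:      {extremes[0]:.1f}x to {extremes[1]:.1f}x")
```

Like-for-like pairing gives about 5.7× to 8.3×, consistent with the headline ratio. If the 34B sits at the low end of its range and the 9B at the high end, the gap narrows to about 3.6×. The opposite pairing widens it to about 13.3×.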
On MMLU, the 34B scores approximately 76–78% versus the 9B's 65–68%, a gap of roughly 8 to 13 points. Multi-hop reasoning, longer-context coherence, and complex instruction adherence all improve substantially with the larger model. For tasks that require holding context across a full document (8K+ tokens) or following multi-constraint instructions, the 34B is noticeably more reliable.
[Yi 1.5 9B Chat](/models/01-ai--yi-1.5-9b-chat) handles single-turn Q&A, classification, summarization of short documents, and low-latency chatbot responses efficiently. Its memory footprint (it fits on a single A10G with room to spare) means better throughput per dollar on shared GPU infrastructure, and its cold-start latency is significantly lower than the 34B's.
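The single-GPU claim can be sanity-checked with back-of-envelope arithmetic on weight memory. The sketch below assumes nominal parameter counts of 9B and 34B, 16-bit weights (2 bytes per parameter), and the A10G's 24 GB of memory. It ignores KV cache and activation overhead, so it gives a lower bound and is not a sizing guide.

```python
A10G_MEMORY_GB = 24  # NVIDIA A10G memory capacity

def weight_memory_gb(params_billions, bytes_per_param=2):
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 params * bytes_per_param / 1e9 bytes per GB.
    """
    return params_billions * bytes_per_param

# Nominal parameter counts (assumption; actual counts differ slightly).
for name, params_b in [("Yi 1.5 9B Chat", 9), ("Yi 1.5 34B Chat", 34)]:
    need = weight_memory_gb(params_b)
    fits = need < A10G_MEMORY_GB
    print(f"{name}: ~{need} GB of 16-bit weights, fits on one A10G: {fits}")
```

Under these assumptions the 9B needs about 18 GB for weights, leaving roughly 6 GB for KV cache and activations. The 34B needs about 68 GB, which would call for multiple GPUs or quantization.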
Both models carry 01.AI's strong Mandarin Chinese training, so the language-quality advantage is less of a differentiator within the family; the choice reduces to task complexity and budget.
Pick Yi 1.5 9B Chat for high-volume, cost-sensitive workloads where single-turn quality in the 65–68% MMLU range is acceptable. Pick Yi 1.5 34B Chat when multi-step reasoning, longer context, or higher accuracy justify the 5–8× token cost increase.