Hyperbolic — models and pricing
Hyperbolic hosts a broad open-weights catalog spanning the Llama, Qwen, DeepSeek, and Mistral families on commodity GPU infrastructure. Pricing is per-token and competitive — typically at or below the median for shared open-weights inference — with no per-tier feature gating between free-trial and paid usage.
The strength is breadth and rate parity: a single API key reaches most production-grade chat models without juggling multiple provider integrations, and the same per-million-token rate often covers Llama 3.3 70B, Qwen 3 72B, and Mixtral 8x22B. Throughput is solid for a GPU-backed host but lower than specialized hardware like Cerebras or Groq.
Best as a default open-weights host when model variety matters more than peak per-model speed, or for teams that want one billing relationship across the open-weights landscape.
Model catalog
11 models| Model | Input / 1M | Output / 1M | Context |
|---|---|---|---|
| Qwen 2.5 72B Instruct | $0.4000 | $0.4000 | 131k |
| Qwen 3 72B Instruct | $0.4000 | $0.4000 | 131k |
| DeepSeek R1 | $2.0000 | $2.0000 | 131k |
| DeepSeek R1 Distill Llama 70B | $0.2000 | $0.2000 | 131k |
| DeepSeek V3 | $0.2500 | $0.2500 | 131k |
| Llama 3.1 405B Instruct | $4.0000 | $4.0000 | 131k |
| Llama 3.1 70B Instruct | $0.4000 | $0.4000 | 131k |
| Llama 3.1 8B Instruct | $0.1000 | $0.1000 | 131k |
| Llama 3.3 70B Instruct | $0.4000 | $0.4000 | 131k |
| Mixtral 8x22B Instruct | $0.6000 | $0.6000 | 66k |
| Mixtral 8x7B Instruct | $0.5000 | $0.5000 | 33k |
Calculate cost for your workload
Plug in your monthly tokens — get the actual bill on every provider serving each model.
Open calculator