Hyperbolic

Verified May 27, 2026

11 models hosted · from $0.10/1M

Hyperbolic hosts a broad open-weights catalog spanning the Llama, Qwen, DeepSeek, and Mistral families on commodity GPU infrastructure. Pricing is per-token and competitive, typically at or below the median for shared open-weights inference, and free-trial and paid usage get the same features with no per-tier gating.

The strength is breadth and rate parity: a single API key reaches most production-grade chat models, so there is no need to juggle multiple provider integrations, and the same per-million-token rate often covers Llama 3.3 70B, Qwen 3 72B, and Mixtral 8x22B. Throughput is solid for a GPU-backed host but lower than on hosts built around specialized hardware, such as Cerebras or Groq.

Best as a default open-weights host when model variety matters more than peak per-model speed, or for teams that want one billing relationship across the open-weights landscape.
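
As a rough illustration of per-token pricing, the sketch below estimates what a workload would cost from the input and output rates in the table that follows. The `estimate_cost` helper and the token counts are hypothetical and not part of any Hyperbolic SDK; the two rates are copied from the DeepSeek R1 and DeepSeek V3 rows.

```python
from decimal import Decimal

# Rates in USD per 1M tokens as (input, output), copied from the table on this page.
# Only two rows are included here to keep the sketch short.
RATES = {
    "DeepSeek R1": (Decimal("2.00"), Decimal("2.00")),
    "DeepSeek V3": (Decimal("0.25"), Decimal("0.25")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per one million tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts (hypothetical helper)."""
    rate_in, rate_out = RATES[model]
    return (rate_in * input_tokens + rate_out * output_tokens) / TOKENS_PER_UNIT


# Illustrative workload: 800K input tokens and 200K output tokens.
for name in RATES:
    print(name, estimate_cost(name, 800_000, 200_000))
```

Because the listed rows charge the same rate for input and output, the estimate depends only on the total token count; the helper keeps the two rates separate in case a model prices them differently.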

Model hosted                     $/1M in   $/1M out   Context
Qwen 2.5 72B Instruct            $0.40     $0.40      131K
Qwen 3 72B Instruct              $0.40     $0.40      131K
DeepSeek R1                      $2.00     $2.00      131K
DeepSeek R1 Distill Llama 70B    $0.20     $0.20      131K
DeepSeek V3                      $0.25     $0.25      131K

Llama 3.1 405B Instruct