Llama 3.1 70B Instruct — pricing, providers, and benchmarks
Parameters: 70B
Context window: 131k tokens
License: llama-3
Released: 2024-07-23
Provider pricing
Sorted from lowest to highest total monthly cost for a workload of 100M input + 10M output tokens per month. Prices are in USD per 1M tokens.
| Provider | Input / 1M tokens | Output / 1M tokens | Monthly cost | Context |
|---|---|---|---|---|
| DeepInfra | $0.2300 | $0.4000 | $27.00 | 131k |
| Fireworks AI | $0.2200 | $0.8800 | $30.80 | 131k |
| Groq | $0.5900 | $0.7900 | $66.90 | 131k |
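The monthly figures in the table can be reproduced from the per-1M-token prices. The following is a minimal sketch, assuming the monthly cost is simply input price times input volume plus output price times output volume, with no caching discounts or minimum fees. The names `PRICES`, `monthly_cost`, and `rank_providers` are illustrative and do not come from any provider's API.

```python
from decimal import Decimal

# Per-1M-token prices in USD (input, output), copied from the table above.
PRICES = {
    "DeepInfra": (Decimal("0.23"), Decimal("0.40")),
    "Fireworks AI": (Decimal("0.22"), Decimal("0.88")),
    "Groq": (Decimal("0.59"), Decimal("0.79")),
}


def monthly_cost(input_price, output_price, input_millions=100, output_millions=10):
    """Return the monthly cost in USD for a workload given in millions of tokens."""
    return input_price * input_millions + output_price * output_millions


def rank_providers(prices, input_millions=100, output_millions=10):
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: monthly_cost(inp, out, input_millions, output_millions)
        for name, (inp, out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in rank_providers(PRICES):
        print(f"{name}: ${cost:.2f}")
```

Passing different `input_millions` and `output_millions` values to `rank_providers` re-ranks the same prices for a different workload. `Decimal` is used instead of `float` so that the totals match the table to the cent.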
Frequently asked questions
How much does it cost to run Llama 3.1 70B Instruct for 100M tokens?
Running Llama 3.1 70B Instruct with 100M input and 10M output tokens per month costs approximately $27.00 on DeepInfra ($23.00 for input plus $4.00 for output), the cheapest provider in the table above as of the latest pricing data. Costs vary significantly with your input/output ratio and with whether you use prompt caching.
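To illustrate how the input/output ratio affects which provider is cheapest, the sketch below computes the ratio at which two providers cost the same. It assumes plain linear pricing with no caching discounts, and `break_even_ratio` is a hypothetical helper name. The prices are the DeepInfra, Fireworks AI, and Groq figures from the pricing table.

```python
from decimal import Decimal


def break_even_ratio(in_a, out_a, in_b, out_b):
    """Input:output token ratio at which providers A and B cost the same.

    Solves in_a*x + out_a*y == in_b*x + out_b*y for x/y.
    Returns None when both input prices are equal (the ratio alone cannot
    produce a crossover). A negative result means one provider is cheaper
    at every ratio.
    """
    if in_a == in_b:
        return None
    return (out_b - out_a) / (in_a - in_b)


deepinfra = (Decimal("0.23"), Decimal("0.40"))
fireworks = (Decimal("0.22"), Decimal("0.88"))
groq = (Decimal("0.59"), Decimal("0.79"))

# DeepInfra vs. Fireworks AI: the two cost the same at 48 input tokens per output token.
ratio = break_even_ratio(*deepinfra, *fireworks)

# DeepInfra vs. Groq: negative, so there is no crossover at any ratio.
no_crossover = break_even_ratio(*deepinfra, *groq)
```

The reference workload has a 10:1 ratio, well below the 48:1 break-even, which is why DeepInfra comes out cheaper there. Above 48:1, Fireworks AI's lower input price outweighs its higher output price.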
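What is the cheapest provider for Llama 3.1 70B Instruct?
DeepInfra currently offers Llama 3.1 70B Instruct at the lowest total cost for the reference workload of 100M input and 10M output tokens per month. Fireworks AI lists a slightly lower input price ($0.22 vs. $0.23 per 1M tokens), so the ranking can change for heavily input-dominated workloads. Prices change frequently, so check the table above for the latest data.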
What context window does Llama 3.1 70B Instruct support?
Llama 3.1 70B Instruct supports a context window of 131,072 tokens (shown as 131k in the pricing table). Individual providers may cap this lower; all three providers in the table currently list the full 131k.
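As a rough sketch of what the limit means in practice, the helper below computes how many tokens remain for the completion once the prompt is counted. It assumes the prompt and the generated output share the same 131,072-token window and that `prompt_tokens` has already been measured with the model's tokenizer. `max_output_budget` is a hypothetical name, not part of any provider SDK.

```python
CONTEXT_WINDOW = 131_072  # tokens, per the figure above


def max_output_budget(prompt_tokens, context_window=CONTEXT_WINDOW):
    """Tokens left for the completion; 0 if the prompt already fills the window."""
    return max(context_window - prompt_tokens, 0)
```

If a provider caps the window lower, pass that cap as `context_window`.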