Llama 3.1 70B Instruct — pricing, providers, and benchmarks

Parameters: 70B
Context window: 131k tokens
License: llama-3
Released: 2024-07-23

Provider pricing

Providers are sorted by total monthly cost for a workload of 100M input tokens plus 10M output tokens. Input and output prices are per 1M tokens.

Provider      Input / 1M  Output / 1M  Monthly cost  Context
DeepInfra     $0.2300     $0.4000      $27.00        131k
Fireworks AI  $0.2200     $0.8800      $30.80        131k
Groq          $0.5900     $0.7900      $66.90        131k
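
The monthly cost column can be reproduced from the per-1M prices. The following is a minimal sketch, assuming the cost is simply token volume times list price with no caching discounts, free tiers, or minimum fees; the names `PRICES`, `monthly_cost`, and `rank_providers` are illustrative and not part of any provider's API.

```python
# Per-1M-token prices in USD (input, output), copied from the table above.
PRICES = {
    "DeepInfra": (0.23, 0.40),
    "Fireworks AI": (0.22, 0.88),
    "Groq": (0.59, 0.79),
}


def monthly_cost(input_price, output_price, input_tokens, output_tokens):
    """Return the USD cost of a workload at the given per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


def rank_providers(input_tokens=100_000_000, output_tokens=10_000_000):
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: round(monthly_cost(inp, out, input_tokens, output_tokens), 2)
        for name, (inp, out) in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in rank_providers():
        print(f"{name}: ${cost:.2f}")
```

Passing different token volumes to `rank_providers` shows how the ordering shifts for workloads other than the 100M input + 10M output default.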

Frequently asked questions

How much does it cost to run Llama 3.1 70B Instruct for 100M tokens?

Running Llama 3.1 70B Instruct with 100M input and 10M output tokens per month costs approximately $27.00 on DeepInfra (100 × $0.23 for input plus 10 × $0.40 for output), the cheapest provider in the table above. Costs vary significantly depending on your input/output ratio and whether you use prompt caching.

What is the cheapest provider for Llama 3.1 70B Instruct?

DeepInfra currently offers Llama 3.1 70B Instruct at the lowest total cost for the standard workload of 100M input and 10M output tokens per month. Prices change frequently, so check the table above for the latest data.
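
Because providers price input and output tokens differently, the cheapest provider can change with the input/output ratio. The sketch below estimates the input-to-output token ratio at which two providers cost the same, using the listed prices and assuming plain list pricing with no caching. The function name `break_even_ratio` is hypothetical.

```python
from fractions import Fraction


def break_even_ratio(provider_a, provider_b):
    """Return the input/output token ratio at which both providers cost the same.

    Each argument is an (input_price, output_price) pair of decimal strings.
    Setting I * a_in + O * a_out equal to I * b_in + O * b_out gives
    I / O = (b_out - a_out) / (a_in - b_in).
    """
    a_in, a_out = (Fraction(p) for p in provider_a)
    b_in, b_out = (Fraction(p) for p in provider_b)
    if a_in == b_in:
        raise ValueError("input prices are equal, so there is no break-even ratio")
    return (b_out - a_out) / (a_in - b_in)


if __name__ == "__main__":
    deepinfra = ("0.23", "0.40")  # prices from the table above
    fireworks = ("0.22", "0.88")
    print(break_even_ratio(deepinfra, fireworks))
```

With the listed prices the break-even ratio is 48 input tokens per output token. The standard workload has a 10:1 ratio, so DeepInfra is cheaper there. At 500M input and 10M output tokens (50:1), Fireworks AI comes to $118.80 against $119.00 on DeepInfra.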

What context window does Llama 3.1 70B Instruct support?

Llama 3.1 70B Instruct supports a context window of 131,072 tokens. Individual providers may cap this lower. The pricing table lists each provider's context limit, and all three providers shown there list the full 131k.