Model crosswalk
Side-by-side on price, capability and workload — three-way comparison.
Hermes 3 Llama 3.1 405b vs Hermes 3 Llama 3.1 70b vs Llama 3.3 70b Instruct
Specs and cheapest providers
| Spec | Hermes 3 Llama 3.1 405b | Hermes 3 Llama 3.1 70b | Llama 3.3 70b Instruct |
|---|---|---|---|
| Parameters | — | — | — |
| Context window | — | — | — |
| License | — | — | — |
| Released | — | — | — |
| Cheapest provider | | | |
| Provider | — | — | — |
| Input / 1M tokens | — | — | — |
| Output / 1M tokens | — | — | — |
Benchmark comparison
No benchmark data available yet.
Editor's take
Nous Research's Hermes 3 Llama 3.1 405B and Hermes 3 Llama 3.1 70B are fine-tunes of Meta's respective base models, both released in August 2024. Llama 3.3 70B Instruct is Meta's December 2024 update to the 70B tier. All three carry the Llama 3 community license and 131K-token context windows. The key comparison is between Nous's fine-tuning approach, which emphasizes reasoning traces, persona fidelity, and reduced refusals, and Meta's own improved instruct alignment in the 3.3 generation.
Hermes 3 70B is a near-zero-cost swap from Llama 3.1 70B Instruct for teams running agent frameworks. The Nous fine-tuning recipe adds XML-tagged reasoning traces, tighter tool-schema adherence, and reduced RLHF softening of system prompt personas. For structured-output pipelines, multi-turn role-adherence evals, and agentic task chains where the base Llama instruct version hedges too aggressively, Hermes 3 70B is a practical improvement.
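For structured-output pipelines, tool-schema adherence can be measured rather than assumed. The sketch below is one minimal way to do that, under stated assumptions: it supposes the model wraps a JSON tool call in an XML-style tag (the tag name `tool_call`, the `scratchpad` tag in the sample, and the `name`/`arguments` keys are illustrative; check the model card for the actual format), and the helpers `extract_tagged_json` and `adheres_to_schema` are hypothetical names, not part of any library.

```python
import json
import re
from typing import Any, Dict, List


def extract_tagged_json(text: str, tag: str = "tool_call") -> List[Dict[str, Any]]:
    """Return every JSON object found inside <tag>...</tag> pairs in text."""
    escaped = re.escape(tag)
    pattern = re.compile(rf"<{escaped}>(.*?)</{escaped}>", re.DOTALL)
    calls = []
    for body in pattern.findall(text):
        try:
            parsed = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip malformed payloads instead of crashing the pipeline
        if isinstance(parsed, dict):
            calls.append(parsed)
    return calls


def adheres_to_schema(call: Dict[str, Any], required_args: Dict[str, type]) -> bool:
    """Check that call['arguments'] has each required key with the expected type."""
    args = call.get("arguments")
    if not isinstance(args, dict):
        return False
    return all(isinstance(args.get(key), typ) for key, typ in required_args.items())


# Assumed sample output; the tag names and JSON layout are illustrative only.
sample = (
    "<scratchpad>User wants weather.</scratchpad>\n"
    '<tool_call>{"name": "get_weather", "arguments": {"city": "Oslo"}}</tool_call>'
)
calls = extract_tagged_json(sample)
ok = adheres_to_schema(calls[0], {"city": str})
```

Running the same check over a few hundred real completions from each candidate model gives an adherence rate that can be compared directly.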
Llama 3.3 70B Instruct is Meta's own answer to improving alignment at the 70B tier. Released in December 2024, it delivers better instruction-following and multi-turn coherence than the 3.1 70B baseline. Teams comparing Hermes 3 70B against Llama 3.3 70B should benchmark directly on their task distribution: neither is universally better, and Meta's 3.3 update closed some of the gap that Nous was filling.
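Benchmarking on your own task distribution can start very small. The following is a rough sketch, not a full evaluation framework: each model is represented as a plain callable from prompt to completion, and `stub_a`, `stub_b`, the two tasks, and the `pass_rates` helper are all hypothetical stand-ins so the example runs offline. In practice each callable would wrap a call to whichever provider hosts the model.

```python
from typing import Callable, Dict, List, Tuple

# A "model" is any callable mapping a prompt to a completion.
Model = Callable[[str], str]
# A task pairs a prompt with a pass/fail checker for the completion.
Task = Tuple[str, Callable[[str], bool]]


def pass_rates(models: Dict[str, Model], tasks: List[Task]) -> Dict[str, float]:
    """Return the fraction of tasks each model passes."""
    if not tasks:
        raise ValueError("tasks must not be empty")
    rates = {}
    for name, model in models.items():
        passed = sum(1 for prompt, check in tasks if check(model(prompt)))
        rates[name] = passed / len(tasks)
    return rates


# Hypothetical stubs standing in for real model endpoints.
def stub_a(prompt: str) -> str:
    return '{"city": "Paris"}' if "JSON" in prompt else "Paris"


def stub_b(prompt: str) -> str:
    return "Paris"


tasks = [
    ("Reply in JSON: capital of France?", lambda out: out.strip().startswith("{")),
    ("Capital of France, one word:", lambda out: out.strip() == "Paris"),
]

rates = pass_rates({"model-a": stub_a, "model-b": stub_b}, tasks)
```

The checkers are the important part: they should encode what success means for your workload (valid JSON, correct tool name, persona held across turns) rather than a generic score.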
Hermes 3 405B scales the same fine-tune recipe to Meta's 405B base. The result reaches frontier-tier capability for complex reasoning chains and long-context analysis tasks, but self-hosting it requires multi-GPU infrastructure, and hosted access comes with substantially higher per-token pricing. Hosted coverage is limited to a small set of providers, primarily Lambda Labs. For agent orchestration where the 70B variants hit capability limits, this is the natural upgrade path within the Hermes family.
Pick Hermes 3 70B for agent frameworks where tool-schema adherence and reasoning traces matter. Pick Llama 3.3 70B for general-purpose instruction work where Meta's updated alignment tuning is sufficient. Pick Hermes 3 405B when 70B-class capability is demonstrably insufficient and the infrastructure cost is justified.
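Whether the 405B price premium is justified depends on token volume, and the arithmetic is simple enough to sketch. The function below multiplies token counts by per-1M-token prices; the prices and volumes in the example are placeholders chosen for easy arithmetic, not quotes from any provider, and `workload_cost_usd` is a hypothetical helper name.

```python
def workload_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost in USD for a workload, given prices per 1M input and output tokens."""
    if min(input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Placeholder workload: 50M input tokens and 10M output tokens per month.
# Placeholder prices (NOT real provider pricing): $0.50 in, $1.50 out per 1M.
small_tier = workload_cost_usd(50_000_000, 10_000_000, 0.50, 1.50)

# Same workload at placeholder prices four times higher.
large_tier = workload_cost_usd(50_000_000, 10_000_000, 2.00, 6.00)

premium = large_tier - small_tier
```

Plugging in the input and output prices from the specs table, once a provider is listed, turns the "is 405B worth it" question into a concrete monthly figure.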
Frequently asked questions
- How does Hermes 3 Llama 3.1 405b compare to Hermes 3 Llama 3.1 70b and Llama 3.3 70b Instruct on price?
- Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
- Which model is best for coding: Hermes 3 Llama 3.1 405b, Hermes 3 Llama 3.1 70b, or Llama 3.3 70b Instruct?
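- No benchmark data is available for these models yet, so HumanEval and other code benchmark scores cannot be compared here. For production code tasks, test each model on your own workload and also consider context window size and provider latency.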
- What is the context window for Hermes 3 Llama 3.1 405b, Hermes 3 Llama 3.1 70b, and Llama 3.3 70b Instruct?
- Context window sizes belong in the Context window row of the specs table above. Per the Editor's take, all three models have 131K-token context windows.