Model crosswalk
Three-way side-by-side comparison on price, capability, and workload.
DeepSeek R1 Distill Llama 70B vs Hermes 3 Llama 3.1 70B vs Refact Llama 3.1 70B
DeepSeek R1 Distill Llama 70B
70B params · 131K context · mit
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
Hermes 3 Llama 3.1 70B
70B params · 131K context · llama-3
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
Refact Llama 3.1 70B
70B params · 131K context · llama-3
Cheapest provider: not listed
$/1M input: not listed
$/1M output: not listed
Specs and cheapest providers
| Spec | DeepSeek R1 Distill Llama 70B | Hermes 3 Llama 3.1 70B | Refact Llama 3.1 70B |
|---|---|---|---|
| Parameters | 70B | 70B | 70B |
| Context window | 131K tokens | 131K tokens | 131K tokens |
| License | mit | llama-3 | llama-3 |
| Released | 2025-01-20 | 2024-08-12 | 2024-09-01 |
| Cheapest provider | — | — | — |
| Input / 1M tokens | — | — | — |
| Output / 1M tokens | — | — | — |
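Per-1M-token prices convert to a per-request cost by scaling each token count by its price. The following is a minimal Python sketch of that arithmetic. No provider prices are listed yet, so the function name `request_cost` and the example prices are hypothetical placeholders, not data for any of these models.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request given per-1M-token prices.

    Prices are passed as strings or numbers and converted via str() so that
    Decimal arithmetic avoids binary floating-point rounding.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * Decimal(str(input_price_per_m)) / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * Decimal(str(output_price_per_m)) / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical prices: $0.50 per 1M input tokens, $1.50 per 1M output tokens.
cost = request_cost(20_000, 2_000, "0.50", "1.50")
print(f"${cost:.4f}")
```

Because output tokens are usually priced higher than input tokens, reasoning-heavy models that emit long chains of thought can cost more per request than the input price alone suggests.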
Benchmark comparison
No benchmark data available yet.
Editor's take
DeepSeek R1 Distill Llama 70B, Hermes 3 Llama 3.1 70B, and Refact Llama 3.1 70B all build on Meta's 70B Llama 3 family (Llama 3.3 for the DeepSeek distill, Llama 3.1 for the other two) but diverge sharply in fine-tuning focus. Choosing between them is less about parameter count and more about what each fine-tune was optimized to do.
DeepSeek R1 Distill Llama 70B, released January 2025, distills reasoning-chain supervision from the full 671B R1 MoE into a Llama 3.3 70B base. Independent benchmarks place it at roughly 70–80 percent of full R1's scores on AIME and MATH, at a fraction of the inference cost of the 671B model. Served on Groq's hardware, it is one of the faster 70B options for latency-sensitive reasoning workloads. The MIT license allows commercial deployment.
Hermes 3 Llama 3.1 70B is a fine-tune by Nous Research, released August 2024 with a 131K context window. The training recipe emphasizes persona fidelity, explicit XML-tagged reasoning traces, and fewer RLHF-induced refusals than the stock Meta Instruct release. For agent frameworks where the model must follow tool schemas, hold a system-prompt persona across long multi-turn sessions, and produce structured output without softening, Hermes 3 is a near drop-in replacement for the stock Llama 3.1 70B Instruct, with measurable improvements in those areas. The Llama 3 community license applies.
Refact Llama 3.1 70B is a fine-tune by Together Computer and Refact AI, released September 2024, targeting code tab-completion and refactoring-agent workflows in IDE-embedded pipelines. The 131K context window fits large file trees and multi-file diffs. Outside that niche of IDE products and agentic code-refactoring loops, the general-purpose Llama 3.1 70B Instruct remains the more versatile option. It inherits the Llama 3 community license.
Pick DeepSeek R1 Distill for multi-step mathematical reasoning at 70B cost. Pick Hermes 3 for agent and tool-use pipelines that need persona fidelity and structured output. Pick Refact for code IDE integrations and file-tree-level refactoring workflows.
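These recommendations can be written down as a simple lookup, for example in a router that selects a model per request. This is only a sketch of the editorial guidance: the `Workload` categories and the `pick_model` helper are hypothetical names, and the returned strings are display names, not provider model IDs.

```python
from enum import Enum


class Workload(Enum):
    """Hypothetical workload categories mirroring the recommendations above."""
    MATH_REASONING = "math_reasoning"
    AGENT_TOOL_USE = "agent_tool_use"
    IDE_CODE = "ide_code"


# Display names only; real provider model identifiers will differ.
RECOMMENDED = {
    Workload.MATH_REASONING: "DeepSeek R1 Distill Llama 70B",
    Workload.AGENT_TOOL_USE: "Hermes 3 Llama 3.1 70B",
    Workload.IDE_CODE: "Refact Llama 3.1 70B",
}


def pick_model(workload: Workload) -> str:
    """Return the recommended model name for a workload category."""
    return RECOMMENDED[workload]


print(pick_model(Workload.AGENT_TOOL_USE))
```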
Frequently asked questions
- How does DeepSeek R1 Distill Llama 70B compare to Hermes 3 Llama 3.1 70B and Refact Llama 3.1 70B on price?
- Use the table above to compare input and output prices per 1M tokens across the cheapest available providers for each model.
- Which model is best for coding: DeepSeek R1 Distill Llama 70B, Hermes 3 Llama 3.1 70B, or Refact Llama 3.1 70B?
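- No benchmark data, including HumanEval, is available for these models yet. Of the three, Refact Llama 3.1 70B is the one fine-tuned for code completion and refactoring. For production code tasks, also consider context window size and provider latency.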
- What is the context window for DeepSeek R1 Distill Llama 70B, Hermes 3 Llama 3.1 70B, and Refact Llama 3.1 70B?
- All three models have a 131K-token context window, as listed in the Context window row of the specs table above.