Use-case preset

Multi-step task agent cost calculator

Generic agent loop with planning, tool calls, and multi-turn context.

This preset models an agent that plans, calls tools, and synthesizes results across multiple turns. Each step sends the accumulated conversation, the tool definitions, and the prior tool results back as input. That resend pattern drives the 75/25 input/output ratio: tool call responses are verbose, while model replies are short and targeted. The 16k context window holds 5–10 planning and tool-use turns before you need to summarize or truncate.
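The turn budget above can be checked with simple arithmetic. The sketch below is only an illustration: the 16k limit comes from the preset, while the fixed overhead (system prompt plus tool schema) and the per-turn token counts are assumed values chosen for the example, and `turns_until_full` is a hypothetical helper name.

```python
def turns_until_full(context_limit: int, fixed_overhead: int, tokens_per_turn: int) -> int:
    """Count the whole turns that fit before the context window is exceeded.

    fixed_overhead covers tokens resent on every call (system prompt, tool schema).
    tokens_per_turn is what each turn adds to the history (tool result + model reply).
    """
    if tokens_per_turn <= 0:
        raise ValueError("tokens_per_turn must be positive")
    available = context_limit - fixed_overhead
    return max(0, available // tokens_per_turn)


CONTEXT_LIMIT = 16_000   # from the preset
FIXED_OVERHEAD = 2_000   # assumption: system prompt + tool definitions

# Assumed per-turn growth: a lean turn vs. a verbose tool result.
for per_turn in (1_400, 2_800):
    turns = turns_until_full(CONTEXT_LIMIT, FIXED_OVERHEAD, per_turn)
    print(f"{per_turn} tokens/turn -> {turns} turns")
```

Under these assumed numbers the window fills after 10 lean turns or 5 verbose ones, so per-turn growth is the main lever on how soon summarization is needed.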

Latency is treated as best-effort, since multi-step tasks typically run asynchronously. The tool schema is stable across turns, so caching it covers ~40% of input tokens. Watch out for context explosion: tool results returned as raw JSON can inflate input size 3–5x within a few steps. Enforce a tool-output length cap in your harness and summarize intermediate results before re-injecting them. The right unit of measure is token cost per completed task, not cost per API call.
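A minimal sketch of the two harness-side practices above: capping tool output before it re-enters the context, and summing cost over a whole task rather than per call. The names `cap_tool_output`, `StepUsage`, and `cost_per_task` are hypothetical, the character cap stands in for a token-based limit, and the prices are placeholders rather than real rates for any listed model.

```python
from dataclasses import dataclass

TRUNCATION_MARKER = "\n[truncated]"


def cap_tool_output(text: str, max_chars: int = 2000) -> str:
    """Return text unchanged if it fits, else cut it to max_chars with a marker."""
    if max_chars < len(TRUNCATION_MARKER):
        raise ValueError("max_chars is too small to hold the truncation marker")
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(TRUNCATION_MARKER)
    return text[:keep] + TRUNCATION_MARKER


@dataclass
class StepUsage:
    """Token usage reported for one API call within a task."""
    input_tokens: int
    output_tokens: int


def cost_per_task(steps: list[StepUsage], input_price_per_m: float,
                  output_price_per_m: float) -> float:
    """Sum the cost of every call in a task; prices are per million tokens."""
    total_in = sum(s.input_tokens for s in steps)
    total_out = sum(s.output_tokens for s in steps)
    return (total_in * input_price_per_m + total_out * output_price_per_m) / 1_000_000


# Two calls of one task, with a 75/25 input/output split overall.
steps = [StepUsage(3_000, 1_000), StepUsage(6_000, 2_000)]
# Placeholder prices (USD per million tokens), not real rates.
print(cost_per_task(steps, input_price_per_m=1.0, output_price_per_m=2.0))
```

Because each step resends the growing history, input tokens climb from call to call, so per-call cost understates what a completed task actually costs.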

Recommended models

meta/llama-3.3-70b-instruct

Strong tool-use and planning capabilities; reliable across multi-step reasoning chains.

deepseek/deepseek-r1

Excellent chain-of-thought reasoning for complex planning steps; competitive pricing.

mistralai/mistral-large-2

Robust function-calling support with good performance on agentic multi-turn tasks.

alibaba/qwen-3-72b-instruct

Strong instruction adherence across multi-step tasks; good at maintaining plan consistency.

nous/hermes-3-llama-3.1-70b

Fine-tuned for agentic workflows with strong tool-call accuracy and structured output.