Use-case preset
Code generation agent cost calculator
Generate multi-file code from a spec; long outputs, 5s latency budget.
A coding agent that accepts a spec and produces multi-file implementations: the prompt includes the spec, the existing file tree, and style guidelines; the output is a substantial body of generated code. The 60/40 input/output split is weighted more heavily toward output than most workloads; expect 2–4k output tokens per invocation. The 32k context window accommodates the codebase context needed for coherent multi-file generation. The 5s p95 latency budget leaves room for heavier models without paying interactive-tier pricing.
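As a rough illustration of the token budget, the sketch below derives an input-token estimate from the 60/40 split and the 2–4k output range, then checks the total against the context window. It assumes the split describes each side's share of total tokens per invocation (one possible reading) and treats "32k" as 32,000 tokens; the function name `estimate_tokens` and its parameters are hypothetical.

```python
def estimate_tokens(output_tokens: int,
                    input_pct: int = 60,
                    context_window: int = 32_000) -> dict:
    """Estimate input tokens from an input/output percentage split.

    Assumption: input_pct is the input share of total tokens per
    invocation (60 for the 60/40 split); the remainder is output.
    """
    output_pct = 100 - input_pct
    # Scale output tokens by the input:output ratio (integer math).
    input_tokens = output_tokens * input_pct // output_pct
    total_tokens = input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
        "fits_context": total_tokens <= context_window,
    }


if __name__ == "__main__":
    # Low and high ends of the expected 2-4k output range.
    for out in (2_000, 4_000):
        print(estimate_tokens(out))
```

Under this reading, both ends of the output range leave headroom in the window; a different interpretation of the split (for example, a share of cost rather than tokens) would give different numbers.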
Caching the tool definitions and style guide (both stable across invocations) saves roughly 35% of input cost. The main quality risk is context fragmentation: if you truncate the codebase to fit the context window, the model tends to generate interfaces that are inconsistent across files. Prefer models with strong code benchmarks (HumanEval, SWE-bench) even at higher per-token cost, because generation errors compound across files and correction is expensive.
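A minimal sketch of how the caching figure could enter a per-invocation cost estimate follows. The 35% input saving comes from the text above; the function name `invocation_cost`, its parameters, and the per-million-token prices in the usage example are hypothetical placeholders, not real model pricing.

```python
def invocation_cost(input_tokens: int,
                    output_tokens: int,
                    input_price_per_mtok: float,
                    output_price_per_mtok: float,
                    cache_savings: float = 0.35) -> dict:
    """Estimate the cost of one invocation, with and without caching.

    cache_savings is the fraction of input cost saved by caching the
    stable prompt parts (roughly 0.35 per the text). Prices are in
    currency units per million tokens and are caller-supplied.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    # Caching only reduces the input side; output cost is unchanged.
    cached_input_cost = input_cost * (1 - cache_savings)
    return {
        "uncached_total": input_cost + output_cost,
        "cached_total": cached_input_cost + output_cost,
        "saved": input_cost - cached_input_cost,
    }


if __name__ == "__main__":
    # Hypothetical prices, chosen only to make the arithmetic visible.
    print(invocation_cost(6_000, 4_000,
                          input_price_per_mtok=2.0,
                          output_price_per_mtok=10.0))
```

Because output tokens are typically priced higher than input tokens and this workload is output-heavy, the saving on the total bill will be smaller than 35%; the sketch makes that gap visible for whatever prices are plugged in.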