Use-case preset

Code review assistant cost calculator

PR diff and style guide in; inline review comments out.

A code review assistant receives a PR diff plus a project style guide and emits inline comments flagging bugs, style violations, and security issues. Diffs with surrounding context commonly run 20–28k tokens, while comments are concise, typically 10–30 tokens each. That asymmetry is why this preset uses a 90/10 input/output split and a 32k context ceiling.
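The split above can be turned into concrete token budgets. The following is a minimal sketch, not the calculator's actual logic: the helper name `split_budget` is hypothetical, and it assumes "32k" means 32,000 tokens (some models define it as 32,768).

```python
def split_budget(context_tokens: int, input_share: float) -> tuple[int, int]:
    """Split a context window into (input, output) token budgets."""
    input_tokens = round(context_tokens * input_share)
    return input_tokens, context_tokens - input_tokens


# Assumption: "32k" is read as 32,000 tokens.
input_budget, output_budget = split_budget(32_000, 0.90)

# Range of comment counts the output budget allows at 10-30 tokens per comment.
max_comments = output_budget // 10  # every comment at the short end
min_comments = output_budget // 30  # every comment at the long end

print(f"input budget:  {input_budget} tokens")
print(f"output budget: {output_budget} tokens")
print(f"room for roughly {min_comments} to {max_comments} comments")
```

Under these assumptions the input budget sits just above the 20–28k range quoted for diffs, leaving a small margin for the system prompt and style guide.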

The `under_5s_p95` target keeps review feedback fast enough to land before the developer context-switches. A `cachedPromptPercent` of 40 covers the stable system prompt, linting rules, and style guide prefix shared across every PR. Coding-specialist models (Codestral, Qwen-2.5-Coder, DeepSeek-Coder) consistently outperform general instruction models on diff comprehension at equivalent parameter counts. Quantized 32B coding models typically reach 90% or more of 70B accuracy at 40% lower cost, so they are worth A/B testing in high-volume pipelines.
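To see how the cached share feeds into cost, here is a rough sketch of a per-review estimate. It does not reproduce the calculator's own formula, which this page does not state. The function `cost_per_review`, the per-million-token prices, and the 50% cached-token discount are all hypothetical placeholders; substitute your provider's real rates.

```python
def cost_per_review(
    input_tokens: int,
    output_tokens: int,
    cached_prompt_percent: float,
    input_price_per_m: float,
    output_price_per_m: float,
    cached_discount: float,
) -> float:
    """Estimate the dollar cost of one review.

    cached_discount is the fraction taken off the input price for cached
    tokens (0.5 means cached tokens are billed at half price).
    """
    cached = input_tokens * cached_prompt_percent / 100
    uncached = input_tokens - cached
    billable_input = uncached + cached * (1 - cached_discount)
    input_cost = billable_input * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


# Hypothetical prices: $1.00 per million input tokens, $2.00 per million output.
with_cache = cost_per_review(28_800, 3_200, 40, 1.00, 2.00, cached_discount=0.5)
no_cache = cost_per_review(28_800, 3_200, 0, 1.00, 2.00, cached_discount=0.5)

print(f"with 40% cached prompt: ${with_cache:.5f} per review")
print(f"with no caching:        ${no_cache:.5f} per review")
```

Because input dominates the token count in this preset, the cached share and its discount move the total far more than output pricing does.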

Recommended models

mistralai/codestral-22b

Purpose-built for code tasks; best-in-class diff comprehension at 22B parameter count.

alibaba/qwen-2.5-coder-32b-instruct

Strong code understanding at 32k context; reliably spots security anti-patterns in diffs.

deepseek/deepseek-coder-v2-instruct

High accuracy on code review benchmarks; good at flagging logical errors in multi-file diffs.

meta/llama-3.3-70b-instruct

General-purpose 70B model that serves as a clean fallback when coding specialists are unavailable; solid style-guide adherence.

alibaba/qwen-3-32b-instruct

Compact 32B with strong reasoning; handles complex refactoring diffs well.