Use-case preset
Content moderation cost calculator
Classify user-generated content at high throughput; tiny output, large volume.
A high-throughput classifier that reads user-generated content and emits one of three labels: safe, flag, or block. The prompt is the content plus a short policy rubric; the output is a structured label plus optional reasoning. The 95/5 input/output ratio reflects that: nearly all tokens are input, and the reply is tiny. The 1k context cap enforces a strict input budget: content that exceeds it is truncated or rejected at the pipeline level.
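As a rough illustration of how the 95/5 ratio and the 1k cap could translate into a per-call budget, here is a minimal Python sketch. It assumes the cap covers input plus output, that the rubric size is known in tokens, and that content arrives already tokenized; the names split_budget and fit_content and the 150-token rubric are hypothetical, not part of the calculator.

```python
from dataclasses import dataclass
from typing import Optional

CONTEXT_CAP = 1_000   # total tokens per call (the preset's 1k cap)
OUTPUT_SHARE = 0.05   # the "5" in the 95/5 input/output ratio


@dataclass(frozen=True)
class Budget:
    input_tokens: int
    output_tokens: int


def split_budget(cap: int = CONTEXT_CAP, output_share: float = OUTPUT_SHARE) -> Budget:
    """Split the context cap into input and output token budgets.

    Assumption: the cap applies to input and output combined.
    """
    output_tokens = round(cap * output_share)
    return Budget(input_tokens=cap - output_tokens, output_tokens=output_tokens)


def fit_content(tokens: list, rubric_tokens: int, budget: Budget,
                on_overflow: str = "truncate") -> Optional[list]:
    """Return the content tokens that fit the budget, or None if rejected.

    The rubric is charged against the input budget first; whatever
    remains is the room available for user content.
    """
    if on_overflow not in ("truncate", "reject"):
        raise ValueError("on_overflow must be 'truncate' or 'reject'")
    room = budget.input_tokens - rubric_tokens
    if room <= 0:
        raise ValueError("rubric leaves no room for content")
    if len(tokens) <= room:
        return tokens
    if on_overflow == "truncate":
        return tokens[:room]   # keep the head, drop the overflow
    return None                # reject the item at the pipeline level


if __name__ == "__main__":
    budget = split_budget()
    kept = fit_content(["tok"] * 1200, rubric_tokens=150, budget=budget)
    print(budget, len(kept))
```

Truncating from the head is only one possible policy; a pipeline that cares about the end of a long post might reject or chunk instead.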
At moderation volumes (millions of items per day), input cost dominates the bill. The 50% cache rate accounts for the stable policy rubric injected on every call. The key trade-off is accuracy versus throughput: a tiny model (1–3B parameters) maximizes throughput and minimizes cost but may miss edge cases. Before deploying a small model, run a precision/recall benchmark against your policy: false negatives in moderation carry real reputational and legal risk.
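The arithmetic behind such an estimate can be sketched as follows. This is not the calculator's actual formula: the function daily_cost, the volume of 5 million items per day, and all three per-million-token prices are hypothetical placeholders chosen only to show how the 95/5 split and the 50% cache rate enter the calculation.

```python
def daily_cost(items_per_day: int, tokens_per_item: int, input_share: float,
               cache_rate: float, price_in: float, price_cached: float,
               price_out: float) -> dict:
    """Estimate daily spend in dollars; prices are per million tokens.

    cache_rate is the fraction of input tokens billed at the cached price.
    """
    total_tokens = items_per_day * tokens_per_item
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    cached_tokens = input_tokens * cache_rate
    fresh_tokens = input_tokens - cached_tokens

    per_million = 1_000_000
    fresh_cost = fresh_tokens / per_million * price_in
    cached_cost = cached_tokens / per_million * price_cached
    output_cost = output_tokens / per_million * price_out
    return {
        "fresh_input": fresh_cost,
        "cached_input": cached_cost,
        "output": output_cost,
        "total": fresh_cost + cached_cost + output_cost,
    }


if __name__ == "__main__":
    # All numbers below are illustrative assumptions, not real price quotes.
    estimate = daily_cost(
        items_per_day=5_000_000,
        tokens_per_item=1_000,   # every call fills the 1k cap (worst case)
        input_share=0.95,        # 95/5 input/output ratio
        cache_rate=0.50,         # half of the input is billed as cached
        price_in=0.10,           # $ per 1M uncached input tokens (hypothetical)
        price_cached=0.025,      # $ per 1M cached input tokens (hypothetical)
        price_out=0.40,          # $ per 1M output tokens (hypothetical)
    )
    for part, dollars in estimate.items():
        print(f"{part:>12}: ${dollars:,.2f}")
```

With these placeholder prices the total comes to roughly $397 per day, about three quarters of it input. Substitute your provider's real rates before drawing conclusions.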