Use-case preset
Email summarization cost calculator
Batch inbox triage: condense email threads to 2–3 bullet summaries.
Inbox-triage pipeline: each job ingests an email thread (subject, sender metadata, body of each message) and emits a 2–3 bullet summary. It runs as a batch workload, so latency is irrelevant; throughput and cost per email are what matter.
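As a rough illustration of the job input described above, the sketch below flattens one thread into a single prompt. The `Message` and `EmailThread` types, the `build_prompt` helper, and the instruction text are all hypothetical; the preset does not prescribe a schema or a system prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    sender: str  # sender metadata, reduced to an address for this sketch
    body: str


@dataclass
class EmailThread:
    subject: str
    messages: List[Message]


# Hypothetical instruction text; the preset does not specify one.
SYSTEM_PROMPT = "Summarize the email thread below in 2-3 bullet points."


def build_prompt(thread: EmailThread) -> str:
    """Flatten a thread into one prompt: instruction, subject, then each message in order."""
    parts = [SYSTEM_PROMPT, f"Subject: {thread.subject}"]
    for i, msg in enumerate(thread.messages, start=1):
        parts.append(f"--- Message {i} from {msg.sender} ---\n{msg.body.strip()}")
    return "\n\n".join(parts)


thread = EmailThread(
    subject="Q3 budget",
    messages=[
        Message("alice@example.com", "Can we move the review to Friday?"),
        Message("bob@example.com", "Friday works. I will update the invite."),
    ],
)
print(build_prompt(thread))
```

Keeping the instruction text identical across jobs is what would make it cacheable; the per-thread portion changes on every request.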
The 90/10 ratio is the input-to-output token split: the thread-heavy prompt dominates, and the 2–3 bullet summary is compact. A 4k-token context window fits threads up to roughly 3k words (at about 0.75 English words per token), which covers the 95th percentile of business email threads. The system prompt and the generated summary share that window, so the usable thread length is somewhat lower. Batch processing means best-effort latency: you can queue hundreds of thousands of emails overnight without paying a latency premium. `cachedPromptPercent` sits at ~10 because each thread is unique, so the cache hit rate stays low unless you reuse a heavy system prompt across requests. The biggest cost lever here is model selection: a 7B–8B instruction-tuned model handles summarization well and runs 5–10× cheaper than a 70B model, without meaningful quality loss at this task complexity.
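To make the cost reasoning concrete, here is a minimal sketch of a per-email estimate. It assumes the 90/10 ratio is applied to a full 4k-token budget (an upper bound, since most threads are shorter) and that cached input tokens are billed at a separate, lower rate. The `Preset` and `Pricing` types, the `cost_per_email` function, and every price below are made-up placeholders, not quotes from any provider; only `cachedPromptPercent` comes from the preset itself.

```python
from dataclasses import dataclass


@dataclass
class Preset:
    total_tokens: int = 4000             # assumed per-email budget (the 4k window)
    input_share: float = 0.90            # the 90/10 input/output split
    cached_prompt_percent: float = 10.0  # mirrors cachedPromptPercent


@dataclass
class Pricing:
    # Hypothetical prices in USD per million tokens.
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def cost_per_email(preset: Preset, pricing: Pricing) -> float:
    """Estimate USD cost for one email under the preset's token split."""
    input_tokens = preset.total_tokens * preset.input_share
    output_tokens = preset.total_tokens - input_tokens
    cached = input_tokens * preset.cached_prompt_percent / 100
    uncached = input_tokens - cached
    total = (
        uncached * pricing.input_per_m
        + cached * pricing.cached_input_per_m
        + output_tokens * pricing.output_per_m
    )
    return total / 1_000_000


# Placeholder price points chosen only to illustrate an 8x gap between model sizes.
SMALL = Pricing(input_per_m=0.10, cached_input_per_m=0.05, output_per_m=0.10)
LARGE = Pricing(input_per_m=0.80, cached_input_per_m=0.40, output_per_m=0.80)

preset = Preset()
for name, pricing in (("small", SMALL), ("large", LARGE)):
    per_email = cost_per_email(preset, pricing)
    print(f"{name}: ${per_email:.6f} per email, ${per_email * 100_000:.2f} per 100k emails")
```

With a cache share of only 10% of input tokens, changing the cached rate barely moves the total, whereas switching the price tier scales every term. That is the sense in which model selection is the dominant lever.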