Use-case preset
Email drafting cost calculator
Draft email replies from a short prompt or thread snippet.
An email drafting assistant that takes a 1–2 paragraph prompt or thread snippet and returns a polished reply. Input and output are roughly equal in length, hence the 50/50 ratio. The 2k context window is intentionally tight — email threads are short, and padding with irrelevant history increases cost without improving quality.
Latency is best-effort; users submitting an email draft can tolerate a few seconds. Because every email carries a different prompt and thread, caching is low value (15%). The cost lever is model size: email drafting is a fluency task, not a reasoning task, so a well-tuned 7–8B model typically matches a 70B model in user satisfaction at one-fifth the cost. Run an A/B on your user base before committing to a larger model. The real cost driver at scale is call volume, not context length.