0 providers50 models

Use-case preset

Multi-turn customer service cost calculator

Long support conversations with case history and KB chunks.

Multi-turn support conversations typically include a persistent system prompt with agent guidelines, the customer's case history, and 2–5 relevant knowledge-base chunks retrieved per turn. Output is short: acknowledgments, follow-up questions, or resolution steps. The 85/15 input/output ratio reflects that context dominates token spend.

Context is set to 16k to hold a full case thread plus KB snippets without truncation. Prompt caching saves 50–70% on input costs because the system prompt and KB chunks repeat across every turn in a session. Latency is best-effort; most support queues can tolerate a few seconds. The main cost lever here is cache hit rate — segment sessions cleanly and keep the cached prefix stable. Watch for cache misses caused by dynamic timestamps or session metadata injected before the KB chunks.

Recommended models

Strong instruction following and long-context coherence at mid-tier cost; good default for support workloads.
Competitive input pricing with solid multi-turn consistency; handles KB chunk injection well.
Reliable instruction adherence and low hallucination rate on factual KB retrieval.
Built-in RAG tooling and native prompt caching support make it a natural fit for KB-augmented support.
Low input cost with strong reasoning; useful when ticket complexity requires nuanced responses.