Use-case preset
Early-stage startup (100M tokens/mo) cost calculator
Real production usage at 100M tokens/mo; cost-per-quality and rate limits matter.
At 100M tokens/month with a 70/30 input/output split and 8k average context, you're spending roughly $80–150/mo on a mid-tier model priced at $0.90/M input and $0.90/M output tokens (70M × $0.90 + 30M × $0.90 = $90 at list price, before caching discounts or retry overhead). That's a sustainable R&D budget before revenue kicks in. A p95 latency under 5s keeps the product feeling responsive without requiring the most expensive real-time infrastructure.
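The arithmetic behind that estimate can be sketched in a few lines. This is a minimal illustration, not the calculator's actual formula: the function name `monthly_cost` and the 50% discount on cached input tokens are assumptions made for the example (providers price cached tokens differently), while the 70/30 split, the $0.90/M prices, and the 40% cached share come from the preset.

```python
def monthly_cost(total_tokens_m, input_share, input_price, output_price,
                 cached_share=0.0, cached_discount=0.0):
    """Estimate monthly spend in dollars.

    total_tokens_m  -- total tokens per month, in millions
    input_share     -- fraction of tokens that are input (0.7 for a 70/30 split)
    input_price     -- dollars per million input tokens
    output_price    -- dollars per million output tokens
    cached_share    -- fraction of input tokens served from a prompt cache
    cached_discount -- fractional price reduction on cached input tokens
                       (hypothetical; check your provider's pricing)
    """
    input_m = total_tokens_m * input_share
    output_m = total_tokens_m - input_m
    cached_m = input_m * cached_share
    uncached_m = input_m - cached_m
    input_cost = (uncached_m * input_price
                  + cached_m * input_price * (1.0 - cached_discount))
    return input_cost + output_m * output_price


# List price, no caching: 70M input + 30M output at $0.90/M each.
print(f"{monthly_cost(100, 0.7, 0.9, 0.9):.2f}")            # 90.00

# Same workload with 40% of input cached at an assumed 50% discount.
print(f"{monthly_cost(100, 0.7, 0.9, 0.9, 0.4, 0.5):.2f}")  # 77.40
```

Under these assumptions the bill lands near the low end of the quoted range; retries, longer outputs, or a pricier model push it toward the top.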
Rate-limit headroom is the hidden constraint: many providers cap new accounts at 10–50 RPM, which becomes a bottleneck before cost does. The 40% cached-prompt setting reflects that you likely have a stable system prompt but variable user messages. Prioritize models that offer a free tier or low-commitment monthly plans so you can iterate without locking in. Cost-per-quality ratio matters more than raw cost, so don't over-optimize for the cheapest option before you've shipped.
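A rough way to check rate-limit headroom is to convert monthly volume into a peak requests-per-minute figure and compare it with the provider's cap. The sketch below is illustrative only: it reads "8k average context" as roughly 8,000 tokens per request, and the 8 active hours per day and 10x burst factor are hypothetical values you should replace with your own traffic data.

```python
def peak_rpm(monthly_tokens, tokens_per_request,
             active_minutes_per_month, burst_factor):
    """Estimate peak requests per minute from monthly token volume.

    burst_factor is the assumed ratio of peak traffic to average traffic
    during active hours (hypothetical; measure it from real logs).
    """
    requests_per_month = monthly_tokens / tokens_per_request
    average_rpm = requests_per_month / active_minutes_per_month
    return average_rpm * burst_factor


def headroom(rpm_cap, needed_rpm):
    """Ratio of the provider's RPM cap to the estimated peak; below 1.0 means throttling."""
    return rpm_cap / needed_rpm


# Assumed traffic shape: 30 days x 8 active hours x 60 minutes.
active_minutes = 30 * 8 * 60

needed = peak_rpm(100_000_000, 8_000, active_minutes, burst_factor=10)
for cap in (10, 50):
    print(f"cap={cap} RPM, peak={needed:.2f} RPM, headroom={headroom(cap, needed):.2f}x")
```

With these assumed inputs, a 10 RPM cap leaves almost no margin at peak while 50 RPM leaves several times the needed capacity. Many providers also enforce tokens-per-minute limits, which this sketch does not model.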