Use-case preset

Log analysis cost calculator

Anomaly detection and clustering over server logs; best-effort latency.

Log analysis jobs paste a 10k–15k token window of structured log lines (JSON, syslog, or mixed) into the prompt and ask the model to identify anomalies, cluster error patterns, or summarise failure modes. Input tokens dominate at 90%; output is a short structured report, typically 200–500 tokens. A 16k context window fits a meaningful slice of a production incident without chunking.
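As a rough illustration of why input dominates the bill, the sketch below prices one job from its token counts. The function name job_cost and the per-million-token prices are hypothetical placeholders, not quotes for any model listed here.

```python
def job_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return (total_cost_usd, input_share_of_cost) for a single job."""
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    total = input_cost + output_cost
    return total, input_cost / total


# Assumed prices (USD per million tokens): 0.50 input, 1.50 output.
total, share = job_cost(12_000, 400, 0.50, 1.50)
print(f"total=${total:.4f}, input share={share:.0%}")
```

With these placeholder prices, input accounts for about 91% of the cost of a 12k-in, 400-out job; the exact share depends on the real price ratio.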

Best-effort latency reflects that this runs as a scheduled job or post-incident analysis, not in the hot path. Cache hit rate is low (0–15%): log content changes every run, and only the system prompt and schema description are reusable. Cost lever: pre-filter logs with a regex pass or an embedding-based deduplication step before sending them to the LLM. Reducing a 10k-token window to 6k tokens cuts input cost by 40%, usually with minimal quality loss. Context utilisation at long lengths matters more here than raw capability score.
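A minimal sketch of the pre-filter idea, under stated assumptions: it drops lines matching a noise regex and then collapses repeats by exact match on a normalised template. This is plain template deduplication, not the embedding-based variant. The NOISE and VOLATILE patterns and the prefilter function are hypothetical and would need tuning to real log formats.

```python
import re

# Hypothetical noise patterns: low-severity levels and health probes.
NOISE = re.compile(r"\b(DEBUG|TRACE)\b|healthcheck", re.IGNORECASE)
# Mask volatile fields (long hex ids, numbers) so repeats share one template.
VOLATILE = re.compile(r"\b[0-9a-f]{8,}\b|\d+")


def prefilter(lines):
    """Drop noise lines; keep the first line per template with a repeat count."""
    seen = {}  # template -> [first_line, count]; dicts preserve insertion order
    for line in lines:
        if NOISE.search(line):
            continue
        template = VOLATILE.sub("<n>", line)
        if template in seen:
            seen[template][1] += 1
        else:
            seen[template] = [line, 1]
    return [f"{first}  (x{count})" if count > 1 else first
            for first, count in seen.values()]


lines = [
    "2024-05-01T10:00:01 ERROR db timeout after 5000 ms conn=12",
    "2024-05-01T10:00:02 ERROR db timeout after 5000 ms conn=17",
    "2024-05-01T10:00:02 DEBUG cache warm",
    "2024-05-01T10:00:03 WARN disk usage 91%",
    "2024-05-01T10:00:04 INFO GET /healthcheck 200",
]
kept = prefilter(lines)
print("\n".join(kept))
```

On the five sample lines, two survive: the repeated timeout collapses to one line with a count, and the DEBUG and health-check lines are dropped. Keeping the count preserves the frequency signal that anomaly clustering relies on.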

Recommended models

meta/llama-3.3-70b-instruct

Strong long-context comprehension; reliable anomaly clustering across large log windows.

alibaba/qwen-3-32b-instruct

32B with good context utilisation at 16k; cost-effective for batch log processing.

deepseek/deepseek-v3

Strong structured analysis output; handles mixed log formats and pattern clustering well.

mistralai/mixtral-8x22b-instruct

MoE efficiency at roughly 39B active params (141B total); fast batch throughput for large log volumes.

meta/llama-3.1-70b-instruct

Solid 70B long-context fidelity for complex multi-service incident logs.