Use-case preset

Structured extraction (JSON) cost calculator

Pull JSON fields from semi-structured text like resumes and listings.

This preset models a pipeline that reads semi-structured text (resumes, job listings, product descriptions) and emits a validated JSON object with defined fields. Most tokens are in the source document and the JSON output is compact, which gives roughly an 80/20 input/output token ratio. The 8k context window comfortably handles a 5–7 page document plus a field schema definition.
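The 80/20 split can be turned into a rough per-batch estimate. The sketch below is illustrative only: the function name, the per-million-token prices, and the assumption that each document uses the full 8k token budget are placeholders, not figures from this calculator.

```python
def estimate_batch_cost(
    docs: int,
    tokens_per_doc: int,
    input_price_per_m: float,   # hypothetical price per 1M input tokens
    output_price_per_m: float,  # hypothetical price per 1M output tokens
    input_share: float = 0.8,   # 80/20 input/output ratio from the preset
) -> float:
    """Estimate the total cost of extracting `docs` documents."""
    input_tokens = tokens_per_doc * input_share
    output_tokens = tokens_per_doc - input_tokens
    cost_per_doc = (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000
    return cost_per_doc * docs


if __name__ == "__main__":
    # Placeholder prices; substitute the real rates for the chosen model.
    total = estimate_batch_cost(
        docs=1_000, tokens_per_doc=8_000,
        input_price_per_m=1.0, output_price_per_m=2.0,
    )
    print(f"Estimated batch cost: {total:.2f}")
```

Because input tokens dominate, the input price usually drives the total far more than the output price in this workload.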

Latency is best-effort because extraction typically runs as a batch or background job. Caching the JSON schema definition, which is stable per document type, provides 20–30% input savings. The main failure mode is hallucinated fields: the model fills optional keys with plausible-sounding data when they are absent from the source. Validate every output against a JSON schema and reject partial or invalid extractions rather than silently accepting them. Smaller models are cost-efficient when the field set is simple; complex nested schemas favor larger models.
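A minimal sketch of the reject-on-failure rule, using only the Python standard library. The field spec, the function name, and the verbatim-match grounding check are all assumptions made for illustration. A real pipeline would more likely hand the structural checks to a full JSON Schema validator library, and the grounding heuristic only catches optional string values that do not appear word for word in the source.

```python
import json

# Hypothetical field spec for a resume: name -> (expected type, required?)
RESUME_FIELDS = {
    "name": (str, True),
    "email": (str, True),
    "years_experience": (int, False),
    "current_employer": (str, False),
}


def validate_extraction(raw_json, source_text, fields=RESUME_FIELDS):
    """Return (record, errors); record is None if any check fails."""
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return None, ["top-level value must be an object"]

    errors = []
    for key in record:
        if key not in fields:
            errors.append(f"unexpected field: {key}")

    haystack = source_text.lower()
    for key, (expected_type, required) in fields.items():
        value = record.get(key)
        if value is None:
            if required:
                errors.append(f"missing required field: {key}")
            continue
        # Exact type match, so a JSON boolean is not accepted as an int.
        if type(value) is not expected_type:
            errors.append(f"wrong type for field: {key}")
            continue
        # Grounding heuristic (assumption): an optional string value must
        # appear verbatim in the source, otherwise treat it as hallucinated.
        if not required and isinstance(value, str) and value.lower() not in haystack:
            errors.append(f"optional field not found in source: {key}")

    # Reject the whole record rather than accepting a partial extraction.
    return (record if not errors else None), errors
```

Rejected documents can be queued for a retry or manual review instead of being written downstream with missing or invented values.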

Recommended models

alibaba/qwen-2.5-72b-instruct

Precise instruction following for structured JSON output; reliable on complex field schemas.

meta/llama-3.1-70b-instruct

Strong extraction accuracy with consistent JSON formatting across document types.

mistralai/mistral-small-3

Good structured output reliability at lower cost for simpler field sets.

alibaba/qwen-3-14b-instruct

Capable at JSON extraction tasks with favorable cost-per-document for batch workloads.