Use-case preset
Structured extraction (JSON) cost calculator
Pull JSON fields from semi-structured text like resumes and listings.
A pipeline that reads semi-structured text (resumes, job listings, product descriptions) and emits a validated JSON object with defined fields. Most tokens come from the source document and the JSON output is compact, so the token split is about 80% input to 20% output. The 8k context window comfortably holds a 5–7 page document plus the field schema definition.
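To make the 80/20 split concrete, here is a minimal sketch of the per-request arithmetic. The function names, the 5,000-token request size, and the per-million-token prices are placeholder assumptions rather than values from this preset, and the sketch assumes that input and output tokens share the same 8k window.

```python
CONTEXT_WINDOW = 8_000  # tokens; the window size named in this preset


def split_tokens(total_tokens: int, input_share: float = 0.8) -> tuple[int, int]:
    """Split a request's token budget using the preset's 80/20 ratio."""
    input_tokens = round(total_tokens * input_share)
    return input_tokens, total_tokens - input_tokens


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given prices per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical request: 5,000 tokens in total (document + schema + JSON output).
inp, out = split_tokens(5_000)
fits = inp + out <= CONTEXT_WINDOW

# Placeholder prices (assumed, not real vendor pricing): $1 and $4 per million tokens.
cost = request_cost(inp, out, input_price_per_m=1.0, output_price_per_m=4.0)
print(inp, out, fits, cost)
```

Because output tokens are usually priced higher than input tokens, the 20% output share can account for a larger part of the bill than its token count suggests.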
Latency is best-effort since extraction typically runs as a batch or background job. Caching the JSON schema definition, which is stable per document type, provides 20–30% input savings. The main failure mode is hallucinated fields: when an optional key has no counterpart in the source, the model fills it with plausible-sounding data. Validate every output with a JSON schema validator, and reject partial extractions rather than silently accepting them. Smaller models are cost-efficient here if the field set is simple; complex nested schemas favor larger models.
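The reject-on-failure policy can be sketched as follows. This is a simplified stand-in for a real JSON schema validator, using only the standard library. The field names, the `ExtractionRejected` exception, and the verbatim-substring grounding check are illustrative assumptions, not part of the preset. The grounding check is a crude heuristic that will reject legitimately normalized values such as reformatted phone numbers.

```python
import json

# Hypothetical resume schema; field names are illustrative only.
REQUIRED = {"name": str, "email": str}
OPTIONAL = {"phone": str, "years_experience": int}


class ExtractionRejected(ValueError):
    """Raised when an extraction fails validation and must not be stored."""


def validate_extraction(raw_json: str, source_text: str) -> dict:
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ExtractionRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ExtractionRejected("top-level value must be an object")

    allowed = {**REQUIRED, **OPTIONAL}
    unknown = set(data) - set(allowed)
    if unknown:
        raise ExtractionRejected(f"unknown fields: {sorted(unknown)}")
    missing = set(REQUIRED) - set(data)
    if missing:
        raise ExtractionRejected(f"missing required fields: {sorted(missing)}")

    for key, value in data.items():
        if key in OPTIONAL and value is None:
            continue  # an explicit null is the honest answer for an absent field
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed[key]):
            raise ExtractionRejected(f"wrong type for field {key!r}")
        # Crude grounding check (assumption): optional strings must appear
        # verbatim in the source, otherwise treat them as hallucinated.
        if key in OPTIONAL and isinstance(value, str) and value not in source_text:
            raise ExtractionRejected(f"field {key!r} not found in source")
    return data


source = "Jane Doe, jane@example.com. Backend engineer."
grounded = '{"name": "Jane Doe", "email": "jane@example.com", "phone": null}'
hallucinated = '{"name": "Jane Doe", "email": "jane@example.com", "phone": "555-0100"}'

record = validate_extraction(grounded, source)
try:
    validate_extraction(hallucinated, source)
    rejected = False
except ExtractionRejected:
    rejected = True
```

Raising an exception instead of returning a partial object keeps rejected documents out of downstream storage and makes it easy to route them to a retry or manual-review queue.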