Use-case preset

Recruiting candidate screening cost calculator

Parse resume and JD into a fit score with reasoning; batch workload.

Each prompt contains a full resume (1–3 pages) plus a job description, totaling 4–8k tokens. Output is a score, a fit summary, and a list of qualification gaps — typically 300–600 tokens. The 85/15 ratio reflects the document-heavy input. Context is set to 16k to handle senior-role JDs that include detailed technical requirements alongside longer resumes.

Batch processing is appropriate; hiring managers check results at the start of their shift, not in real time. Cached prompt percent is 30–40% because the JD repeats across all candidates for a given role, while the resume portion is unique each time. The main accuracy risk is false negatives — set up a calibration set with human-labeled decisions and measure recall before deploying. At scale, a small model fine-tuned on your labeled data often outperforms a large general model on this structured scoring task.

Recommended models

meta/llama-3.3-70b-instruct

Strong structured reasoning on document comparison tasks; reliable score calibration.

alibaba/qwen-2.5-72b-instruct

Good extraction accuracy on resume fields; competitive batch pricing.

mistralai/mistral-large-2

Low hallucination rate on qualification gap analysis; consistent output formatting.

deepseek/deepseek-r1-distill-llama-70b

Reasoning-optimized distill; useful when nuanced multi-criteria scoring is required.

cohere/command-r

Cost-effective for high-volume screening runs where budget is the primary constraint.