IQ
Composite IQ, cost tradeoffs, frontier progress, then the dimension and benchmark evidence behind the score
Composite IQ
AI IQ combines abstract, mathematical, programmatic, and academic reasoning estimates. Missing benchmark coverage is filled conservatively, and only inside the scoring pipeline, so that omissions do not inflate scores.
Effective cost & iso-curves
Effective cost, shown on the X-axis, is token cost (the cost of 2M input + 1M output tokens) × a token usage multiplier (this model's AA token usage ÷ the median model's). It estimates what each model spends on a task that the median model handles with that 2:1 token mix.
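The effective-cost formula above can be sketched in a few lines. This is a minimal illustration, not the site's actual pipeline: the function name, the per-million-token price parameters, and the example figures are all hypothetical.

```python
def effective_cost(input_price_per_m: float,
                   output_price_per_m: float,
                   model_token_usage: float,
                   median_token_usage: float) -> float:
    """Effective cost = token cost x token usage multiplier.

    Assumption: prices are quoted per 1M tokens, so the 2M input + 1M output
    mix costs 2 * input price + 1 * output price.
    """
    # Cost of the fixed 2:1 token mix (2M input tokens + 1M output tokens).
    token_cost = 2 * input_price_per_m + 1 * output_price_per_m
    # How many tokens this model uses relative to the median model.
    usage_multiplier = model_token_usage / median_token_usage
    return token_cost * usage_multiplier


# Hypothetical model: $3 per 1M input tokens, $15 per 1M output tokens,
# using 120M tokens where the median model uses 80M.
example_cost = effective_cost(3.0, 15.0, 120.0, 80.0)
print(example_cost)
```

A model that uses exactly the median number of tokens has a multiplier of 1, so its effective cost equals its plain token cost for the 2:1 mix.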
Iso-curves trace lines of equal preference. The dropdown picks the Y-axis metric (overall IQ, the four dimension IQs, or any of the 10 individual benchmarks). The ratio control, 1:1 by default, weights quality against cost: at 1:1, one IQ point is worth one halving of cost. Click right (1:2, 1:5…) to make cost matter more, or left (2:1, 5:1…) to make quality matter more. Models above and to the right of a curve are strictly better.
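One way to read the trade-off is as a score that is linear in IQ and in the base-2 logarithm of cost, so that each halving of cost adds a fixed amount. The sketch below assumes that form, and assumes a ratio of q:c maps directly to a quality weight q and a cost weight c; the chart's real formula is not given here, and both function names are hypothetical.

```python
import math


def preference_score(iq: float, cost: float,
                     quality_weight: float = 1.0,
                     cost_weight: float = 1.0) -> float:
    """Assumed preference: quality_weight * IQ - cost_weight * log2(cost).

    At 1:1, gaining one IQ point and halving the cost raise the score equally.
    """
    return quality_weight * iq - cost_weight * math.log2(cost)


def iso_curve_iq(score: float, cost: float,
                 quality_weight: float = 1.0,
                 cost_weight: float = 1.0) -> float:
    """IQ needed at a given cost to sit on the iso-curve with this score."""
    return (score + cost_weight * math.log2(cost)) / quality_weight


# Two hypothetical models: B is one IQ point lower but half the cost.
a_balanced = preference_score(100, 8.0)   # ratio 1:1
b_balanced = preference_score(99, 4.0)    # ratio 1:1, same iso-curve as A
a_cost_heavy = preference_score(100, 8.0, cost_weight=2.0)  # ratio 1:2
b_cost_heavy = preference_score(99, 4.0, cost_weight=2.0)   # ratio 1:2
print(a_balanced, b_balanced, a_cost_heavy, b_cost_heavy)
```

Under these assumptions the two models tie at 1:1, while at 1:2 the cheaper model scores higher, which matches the description of cost mattering more as the control moves right.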
Tracking frontier progress
This chart focuses on flagship frontier checkpoints rather than every SKU. It is the fastest way to see whether the leading model curve is actually moving.
Read the methodology for how raw benchmark values become dimension and composite IQ estimates.