Open data
Raw inventory
Browse every individual API call and every item-level response. Pick a model and instrument to see the exact items, the model’s raw Likert answer for each, the reverse-keyed scored value, and the aggregate dimension scores. Click into any single run to see the verbatim system prompt, user prompt, and raw model response. Bulk CSV downloads are linked below the picker.
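As a rough illustration of how a raw Likert answer relates to its scored value, here is a minimal sketch of reverse-keying. It assumes a 1 to 5 scale and a per-item `reverse_keyed` flag; the function name, the parameters, and the scale bounds are hypothetical and may not match how this dataset actually scores items.

```python
def score_item(raw: int, reverse_keyed: bool,
               scale_min: int = 1, scale_max: int = 5) -> int:
    """Return the scored value for one Likert answer.

    Assumption: answers are integers on a closed scale [scale_min, scale_max].
    A reverse-keyed item is mirrored around the scale midpoint, so on a
    1-5 scale a raw 2 becomes 4 and a raw 5 becomes 1.
    """
    if not scale_min <= raw <= scale_max:
        raise ValueError(f"answer {raw} is outside {scale_min}..{scale_max}")
    if reverse_keyed:
        return scale_min + scale_max - raw
    return raw


# Usage: a regular item keeps its raw value, a reverse-keyed one is mirrored.
print(score_item(2, reverse_keyed=False))  # 2
print(score_item(2, reverse_keyed=True))   # 4
```

A dimension score would then be an aggregate (for example the mean) of the scored values of the items belonging to that dimension.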
Bulk CSV downloads
These files are regenerated automatically on every dataset refresh from the SQLite source of truth, and can also be browsed directly on GitHub:
- models.csv — one row per model (vendor, pricing, family)
- instruments.csv — one row per active instrument
- runs.csv — every completed API call with token + cost telemetry
- scores.csv — every (run, dimension) → mean
- responses.csv — most granular: every item, every raw answer (~130K rows)
- per_model_summary.csv — flat wide table for quick exploration
- cohort_summary.csv — cohort mean/min/max per (instrument, dimension, framing)
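The relationship between the item-level file and the per-run dimension means can be sketched as a simple group-and-average. The example below uses only the Python standard library and a tiny in-memory sample in place of the real download. The column names `run_id`, `dimension`, and `scored_value`, and the dimension labels `dim_a` and `dim_b`, are assumptions made for illustration; check the header row of the actual CSV before adapting it.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample standing in for responses.csv; the real column
# names and values may differ.
SAMPLE = """run_id,dimension,scored_value
r1,dim_a,4
r1,dim_a,2
r1,dim_b,5
r2,dim_a,3
"""


def dimension_means(csv_text: str) -> dict:
    """Group item-level rows by (run_id, dimension) and average the scores."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["run_id"], row["dimension"])
        groups[key].append(float(row["scored_value"]))
    return {key: mean(values) for key, values in groups.items()}


for (run_id, dimension), value in sorted(dimension_means(SAMPLE).items()):
    print(run_id, dimension, value)
```

To run this against a downloaded file, pass the file's contents to `dimension_means`, for example `dimension_means(open("responses.csv", encoding="utf-8").read())`, after adjusting the column names to the real header.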