EarthPilot Personality·Bench
Open data

Raw inventory

Browse every individual API call and every item-level response. Pick a model and an instrument to see the exact items, the model’s raw Likert answer for each item, the scored value after any reverse-keying, and the aggregate dimension scores. Click into any single run to see the verbatim system prompt, user prompt, and raw model response. Bulk CSV downloads are linked below the picker.
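For readers unfamiliar with reverse-keying, the sketch below shows how a raw Likert answer might become a scored value and then a dimension score. It is an illustration only, not the benchmark's actual pipeline: the 1 to 5 scale, the use of a plain mean as the aggregate, and the field names `dimension`, `raw`, and `reverse_keyed` are all assumptions made for this example.

```python
from statistics import mean

def score_item(raw, reverse_keyed, scale_min=1, scale_max=5):
    """Return the scored value for one Likert answer.

    Assumption: a 1-5 scale. A reverse-keyed item is flipped around the
    scale midpoint, so 1 <-> 5, 2 <-> 4, and 3 stays 3.
    """
    if not scale_min <= raw <= scale_max:
        raise ValueError(f"raw answer {raw} is outside {scale_min}-{scale_max}")
    return scale_min + scale_max - raw if reverse_keyed else raw

def dimension_scores(responses):
    """Average scored values per dimension (mean aggregation is an assumption)."""
    by_dimension = {}
    for r in responses:
        scored = score_item(r["raw"], r["reverse_keyed"])
        by_dimension.setdefault(r["dimension"], []).append(scored)
    return {dim: mean(values) for dim, values in by_dimension.items()}

# Hypothetical item-level rows; the field names are illustrative only.
example = [
    {"dimension": "extraversion", "raw": 4, "reverse_keyed": False},
    {"dimension": "extraversion", "raw": 1, "reverse_keyed": True},
    {"dimension": "agreeableness", "raw": 2, "reverse_keyed": True},
]
scores = dimension_scores(example)
```

Under these assumptions, the second row's raw answer of 1 is scored as 5 because the item is reverse-keyed, which gives the example's extraversion dimension a mean of 4.5.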

Bulk CSV downloads

The CSVs are regenerated automatically on every dataset refresh from the SQLite source of truth. They can also be browsed directly on GitHub: