The bring-your-own-AI benchmark on /ai grew up.
- Five scenarios instead of one: invoice, receipt, quote, resume, report.
- 16+ providers, including OpenRouter routes for cross-vendor comparison.
- Sortable leaderboard — sort by latency, tokens, success rate, or scenario.
- Per-row PDF links — every generated artifact is downloadable so you can eyeball the actual output, not just the numeric score.
The benchmark runner is still scripts/tools/benchmark-models.ts. Results are published as static JSON in the public directory and refreshed on demand.
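
As a rough illustration of how a leaderboard could sort rows loaded from the published JSON, here is a minimal TypeScript sketch. The row shape, field names (`latencyMs`, `tokens`, `successRate`, `pdfUrl`), provider names, and sample values are all assumptions made up for this example; they are not the actual schema or real benchmark results.

```typescript
// Hypothetical row shape: the real JSON schema may differ.
type Scenario = "invoice" | "receipt" | "quote" | "resume" | "report";

interface BenchmarkRow {
  provider: string;
  scenario: Scenario;
  latencyMs: number;
  tokens: number;
  successRate: number; // fraction between 0 and 1
  pdfUrl: string; // link to the generated PDF artifact
}

type SortKey = "latencyMs" | "tokens" | "successRate" | "scenario";

// Returns a sorted copy; the input array is left untouched.
function sortRows(
  rows: readonly BenchmarkRow[],
  key: SortKey,
  descending = false,
): BenchmarkRow[] {
  const sign = descending ? -1 : 1;
  return [...rows].sort((a, b) => {
    const x = a[key];
    const y = b[key];
    // Numeric columns compare by difference, text columns alphabetically.
    const cmp =
      typeof x === "number" && typeof y === "number"
        ? x - y
        : String(x).localeCompare(String(y));
    return sign * cmp;
  });
}

// Invented sample data, for illustration only.
const sample: BenchmarkRow[] = [
  { provider: "provider-a", scenario: "invoice", latencyMs: 1200, tokens: 850, successRate: 0.9, pdfUrl: "/example/a.pdf" },
  { provider: "provider-b", scenario: "receipt", latencyMs: 800, tokens: 900, successRate: 1.0, pdfUrl: "/example/b.pdf" },
  { provider: "provider-c", scenario: "quote", latencyMs: 1500, tokens: 700, successRate: 0.8, pdfUrl: "/example/c.pdf" },
];

const fastestFirst = sortRows(sample, "latencyMs");
const mostReliableFirst = sortRows(sample, "successRate", true);
```

Returning a copy rather than sorting in place keeps the loaded results reusable when the user switches between sort columns.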