
AI leaderboard with five scenarios

The /ai page is now a sortable leaderboard across five scenarios and 16+ providers, with per-row download links to every generated PDF.

ai, benchmark

The bring-your-own-AI benchmark on /ai grew up.

  • 5 scenarios instead of one — invoice, receipt, quote, resume, report.
  • 16+ providers, including OpenRouter routes for cross-vendor comparison.
  • Sortable leaderboard — sort by latency, tokens, success rate, or scenario.
  • Per-row PDF links — every generated artifact is downloadable so you can eyeball the actual output, not just the numeric score.
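As a rough illustration of how the sortable columns could work, here is a minimal sketch. The row shape, the field names (latencyMs, tokens, successRate, pdfUrl), the sample values, and the PDF paths are all assumptions made up for this example, not the actual schema or data behind /ai.

```typescript
// Hypothetical row shape: these field names are assumptions, not the real schema.
type Scenario = "invoice" | "receipt" | "quote" | "resume" | "report";

interface LeaderboardRow {
  provider: string;
  scenario: Scenario;
  latencyMs: number;
  tokens: number;
  successRate: number; // fraction between 0 and 1
  pdfUrl: string; // per-row link to the generated PDF
}

type SortKey = "latencyMs" | "tokens" | "successRate" | "scenario";

// Returns a sorted copy so the caller's array is left untouched.
function sortRows(
  rows: readonly LeaderboardRow[],
  key: SortKey,
  direction: "asc" | "desc" = "asc",
): LeaderboardRow[] {
  const sign = direction === "asc" ? 1 : -1;
  return [...rows].sort((a, b) => {
    if (key === "scenario") {
      // Scenario names are compared alphabetically.
      return sign * a.scenario.localeCompare(b.scenario);
    }
    // The remaining keys are all numeric.
    return sign * (a[key] - b[key]);
  });
}

// Made-up sample rows, for illustration only.
const sampleRows: LeaderboardRow[] = [
  { provider: "provider-a", scenario: "invoice", latencyMs: 1200, tokens: 900, successRate: 1.0, pdfUrl: "/example/provider-a-invoice.pdf" },
  { provider: "provider-b", scenario: "receipt", latencyMs: 800, tokens: 1100, successRate: 0.8, pdfUrl: "/example/provider-b-receipt.pdf" },
  { provider: "provider-c", scenario: "quote", latencyMs: 1500, tokens: 700, successRate: 0.9, pdfUrl: "/example/provider-c-quote.pdf" },
];

const fastestFirst = sortRows(sampleRows, "latencyMs");
const mostReliableFirst = sortRows(sampleRows, "successRate", "desc");
console.log(fastestFirst.map((r) => r.provider));
console.log(mostReliableFirst.map((r) => r.provider));
```

Sorting a copy rather than mutating in place keeps the original order available, which is handy if the table needs a "reset" state.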

The benchmark runner is unchanged: it is still scripts/tools/benchmark-models.ts. Results are published as static JSON in the public directory and refreshed on demand.
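For anyone consuming the published JSON, a defensive parse might look like the sketch below. The BenchmarkResult shape and its field names are hypothetical; the real file layout is not described here, so treat this as one possible approach rather than the actual format.

```typescript
// Hypothetical result shape: the real JSON schema may differ.
interface BenchmarkResult {
  provider: string;
  scenario: string;
  latencyMs: number;
  success: boolean;
}

// Type guard that checks every field before trusting an entry.
function isBenchmarkResult(value: unknown): value is BenchmarkResult {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    typeof r["provider"] === "string" &&
    typeof r["scenario"] === "string" &&
    typeof r["latencyMs"] === "number" &&
    Number.isFinite(r["latencyMs"]) &&
    typeof r["success"] === "boolean"
  );
}

// Parses the JSON text and keeps only well-formed entries.
// Throws if the top-level value is not an array.
function parseResults(json: string): BenchmarkResult[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) {
    throw new Error("expected a JSON array of results");
  }
  return data.filter(isBenchmarkResult);
}

// Made-up payload, for illustration only: the second entry lacks latencyMs.
const exampleJson = JSON.stringify([
  { provider: "provider-a", scenario: "invoice", latencyMs: 1200, success: true },
  { provider: "provider-b", scenario: "receipt", success: false },
]);

const results = parseResults(exampleJson);
console.log(results.length);
```

Because the file is static and refreshed on demand, skipping malformed entries instead of failing the whole load keeps a page usable even if one run wrote a partial record.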