Answer evidence browser

Browse every answer behind the judged comparison.

Search the public demo answer set, scan per-question scores, then open any row to compare MHF and baseline answer bodies side by side.

What this page shows: The selected public comparison set with generated answers, judge notes, inspection deltas, and workflow-level results.
How to use it: Filter or search, click a score-table row, then switch workflows to compare the answers that produced each score.
Published scope: The OpenRouter public demo exposes 100 question rows; the Gloo public-300 payload exposes 300 rows and 900 generated responses.

Question score table

Click any row to inspect the answers

Rows follow the current search and topic filters.

Question | Source | Primary | Comparator | Delta | Spread
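
The way table rows follow the search and topic filters can be sketched roughly as below. This is a minimal illustration, not the page's actual implementation: the `ScoreRow` shape, the `visible_rows` helper, the `topic` field, and the sample rows are all hypothetical, and treating Delta as the primary score minus the comparator score is an assumption about what the column means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRow:
    # Hypothetical row shape; the real payload fields are not documented here.
    question: str
    topic: str
    primary: float      # score of the primary workflow
    comparator: float   # score of the comparison workflow

    @property
    def delta(self) -> float:
        # Assumption: Delta is the primary score minus the comparator score.
        return self.primary - self.comparator


def visible_rows(rows, search="", topics=None):
    """Return rows matching a case-insensitive search and an optional topic set."""
    needle = search.strip().lower()
    return [
        row for row in rows
        if needle in row.question.lower()
        and (not topics or row.topic in topics)
    ]


# Made-up rows for illustration only; not taken from the published answer set.
SAMPLE_ROWS = [
    ScoreRow("Example question about topic A", "topic-a", 8.0, 6.0),
    ScoreRow("Another example about topic B", "topic-b", 5.0, 7.0),
    ScoreRow("Second example question on topic A", "topic-a", 7.0, 7.0),
]

for row in visible_rows(SAMPLE_ROWS, search="question", topics={"topic-a"}):
    print(row.question, row.delta)
```

An empty search string and an empty topic set leave every row visible, which matches the idea that the table only narrows once a filter is active.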