Benchmark dossier
This page is the live public demo surface for the current MHF packaging decision. It runs three explicit MHF prose modes on the selected shared question set, scores only the final answer, and places the results alongside raw-model, worldview-prompt, and the best secular structured baselines.
Each card below is a real workflow with its own routing behavior, output profile, and benchmark score.
The scoreboard reports overall FAI and MRB scores, plus source-specific breakdowns across objective, conflict, and life-stage rows.
| Workflow | Family | FAI | MRB | Objective | Conflict | Life Stage | Avg chars |
|---|---|---|---|---|---|---|---|
| Loading scoreboard... | | | | | | | |
The same workflow can be strong in one flourishing dimension and weak in another. This table keeps that visible.
| Loading dimension breakdown... |
These are the highest-spread questions in the current run. Open a case to read the actual answers that produced the score gap.
The site should make the assumptions plain instead of pretending the benchmark is more universal than it is.