Every other moral dataset asks: "Is this action right or wrong?" UniMoral asks the harder question: "Is this action right or wrong for someone with these specific moral foundations?" That makes it the perfect validation target for MHF.
UniMoral is a multilingual moral evaluation dataset covering six languages: Arabic, Chinese, English, Hindi, Russian, and Spanish. But what makes it extraordinary is not the breadth -- it is the depth. Every annotator in UniMoral completed the MFQ2 (Moral Foundations Questionnaire, Second Edition), giving researchers their individual Haidt moral profile. Then those same annotators judged a set of moral dilemmas.
This means, for the first time, we can ask: does knowing someone's moral foundations profile predict their moral judgment? If it does, then MHF's architecture -- which parameterizes moral reasoning via Haidt foundation weights -- is validated at the individual level, not just the cultural one.
This is MHF's most direct validation opportunity. The experimental design is simple and devastating:
1. Take the annotator's own MFQ2 scores.
2. Parameterize the MHF graph with their profile as the theta vector.
3. Run constraint propagation on the same dilemma they judged.
4. Compare MHF's output to their actual judgment.
If MHF can predict individual annotator judgments better than a flat classifier that ignores moral profiles, the core thesis holds: moral reasoning is parameterized constraint propagation, not flat classification.
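The evaluation loop above can be sketched in a few lines. Everything here is a stand-in: `Annotator`, `mhf_predict`, and the record shapes are hypothetical, and `mhf_predict` substitutes a simple weighted sum for MHF's actual constraint propagation, which is not specified in this section.

```python
from dataclasses import dataclass

# Hypothetical record shapes; UniMoral's real schema will differ.
@dataclass
class Annotator:
    theta: dict      # MFQ2 foundation weights, mean-centered, e.g. {"care": 0.8, ...}
    judgments: dict  # dilemma_id -> the annotator's actual judgment (0/1)

def mhf_predict(theta, dilemma):
    """Placeholder for MHF's parameterized constraint propagation.
    Stand-in: weight the dilemma's foundation loadings by theta."""
    score = sum(theta.get(f, 0.0) * w for f, w in dilemma["loadings"].items())
    return 1 if score >= 0 else 0

def flat_predict(dilemma, majority):
    """Flat-classifier baseline: predict the crowd-majority label,
    ignoring the individual annotator's moral profile entirely."""
    return majority[dilemma["id"]]

def accuracy(annotators, dilemmas, majority):
    """Per-annotator accuracy of MHF vs. the flat baseline."""
    mhf_hits = flat_hits = total = 0
    for a in annotators:
        for d in dilemmas:
            y = a.judgments[d["id"]]
            mhf_hits += (mhf_predict(a.theta, d) == y)
            flat_hits += (flat_predict(d, majority) == y)
            total += 1
    return mhf_hits / total, flat_hits / total
```

When two annotators with opposite Authority weights judge the same authority-loaded dilemma differently, the flat baseline can be right about at most one of them; a per-annotator parameterization can in principle be right about both. That asymmetry is the whole experiment.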
Why this is different from what anyone else has done: Delphi predicts the crowd average. ETHICS checks against philosophical principles. Neither can explain why two equally thoughtful people disagree. MHF, parameterized per-annotator via UniMoral's MFQ2 data, can predict which person will say "yes" and which will say "no" -- and explain the difference in terms of their Care vs. Authority vs. Sanctity weights.
UniMoral's language coverage maps onto genuinely different moral cultures. Arabic and Hindi annotators tend to score higher on Authority and Sanctity. English annotators weight Care and Fairness more heavily. Chinese annotators show a distinctive pattern in Loyalty. These tendencies are not stereotypes; they track measured MFQ2 distributions. The bars below are representative profiles drawn from published Haidt research on similar populations, not UniMoral's own annotator data.
The pattern is clear. WEIRD (Western, Educated, Industrialized, Rich, Democratic) populations weight Care and Fairness heavily while deprioritizing Authority and Sanctity. Non-WEIRD populations have a more balanced profile -- or even reverse the hierarchy. This is not a bug in the data. It is the core phenomenon MHF is designed to capture.
| MHF Prediction | What UniMoral Can Test | Expected Result |
|---|---|---|
| High-Authority annotators differ from low-Authority | Split annotators by Authority MFQ2 score, compare judgments on authority-relevant dilemmas | Systematic divergence in obedience/conformity scenarios |
| High-Sanctity annotators differ from low-Sanctity | Same split on Sanctity, compare purity/sexuality dilemmas | Systematic divergence in disgust/purity scenarios |
| Individual theta predicts better than population mean | Compare prediction accuracy: per-annotator MHF vs. flat classifier | MHF per-annotator > flat by measurable margin |
| Cross-language profiles match Haidt literature | Compare average MFQ2 by language to published norms | Arabic/Hindi high Auth+Sanct; English high Care+Fair |
| MHF Christian params match high-Auth+Sanct annotators | Cluster annotators by profile similarity to Christian weights | High-similarity cluster disproportionately from Arabic, Hindi subsets |
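The first two rows of the table reduce to the same procedure: median-split annotators on one foundation's MFQ2 score and measure the judgment gap on dilemmas relevant to that foundation. A minimal sketch, assuming a hypothetical annotator schema (a dict with `"mfq2"` scores and `"judgments"` keyed by dilemma id):

```python
from statistics import median

def foundation_split_divergence(annotators, foundation, dilemma_ids):
    """Median-split annotators on one MFQ2 foundation score and return
    the difference in mean 'yes' rates between the high and low groups
    on the given foundation-relevant dilemmas."""
    cut = median(a["mfq2"][foundation] for a in annotators)
    high = [a for a in annotators if a["mfq2"][foundation] >= cut]
    low = [a for a in annotators if a["mfq2"][foundation] < cut]

    def yes_rate(group):
        votes = [a["judgments"][d] for a in group for d in dilemma_ids]
        return sum(votes) / len(votes)

    # MHF predicts a systematic, nonzero gap between the two groups
    # on dilemmas that load on this foundation.
    return yes_rate(high) - yes_rate(low)
```

Running this with `foundation="authority"` on obedience/conformity dilemmas tests row one; `foundation="sanctity"` on purity dilemmas tests row two. A gap near zero on both would falsify the per-foundation parameterization.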
The entire field of moral AI has been building systems that predict the average judgment of a culturally narrow population. This is useful for content moderation. It is useless for moral advice.
When someone asks "Should I leave my alcoholic father?", the right answer depends on their moral commitments -- not the average American's. A person high in Authority and Loyalty faces a genuinely different moral landscape than a person high in Care and Liberty. UniMoral is the first dataset that lets us test whether MHF can navigate that difference.
If it works -- if per-annotator parameterization predicts judgments better than flat classification -- then we have demonstrated something the field has not seen: a moral reasoning system that respects moral diversity without collapsing into relativism. The hierarchy is not "anything goes." It is "given your commitments, here is what consistency demands." UniMoral lets us prove it.