Every other moral dataset asks: "Is this action right or wrong?" UniMoral asks the harder question: "Is this action right or wrong for someone with these specific moral foundations?" That makes it the perfect validation target for MHF.
UniMoral is a multilingual moral evaluation dataset covering six languages: Arabic, Chinese, English, Hindi, Russian, and Spanish. But what makes it extraordinary is not the breadth -- it is the depth. Every annotator in UniMoral completed the MFQ2 (Moral Foundations Questionnaire, Second Edition), giving researchers their individual Haidt moral profile. Then those same annotators judged a set of moral dilemmas.
This means, for the first time, we can ask: does knowing someone's moral foundations profile predict their moral judgment? If it does, then MHF's architecture -- which parameterizes moral reasoning via Haidt foundation weights -- is validated at the individual level, not just the cultural one.
This is MHF's most direct validation opportunity. The experimental design is simple and devastating:
1. Take the annotator's own MFQ2 scores.
2. Parameterize the MHF graph with their profile as the theta vector.
3. Run constraint propagation on the same dilemma they judged.
4. Compare MHF's output to their actual judgment.
If MHF can predict individual annotator judgments better than a flat classifier that ignores moral profiles, the core thesis holds: moral reasoning is parameterized constraint propagation, not flat classification.
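The evaluation loop above can be sketched in a few lines. Everything here is a stand-in: `Annotator`, `mhf_predict`, and the record shapes are hypothetical, and `mhf_predict` substitutes a simple weighted sum for MHF's actual constraint propagation, which is not specified in this section.

```python
from dataclasses import dataclass

# Hypothetical record shapes; UniMoral's real schema will differ.
@dataclass
class Annotator:
    theta: dict      # MFQ2 foundation weights, mean-centered, e.g. {"care": 0.8, ...}
    judgments: dict  # dilemma_id -> the annotator's actual judgment (0/1)

def mhf_predict(theta, dilemma):
    """Placeholder for MHF's parameterized constraint propagation.
    Stand-in: weight the dilemma's foundation loadings by theta."""
    score = sum(theta.get(f, 0.0) * w for f, w in dilemma["loadings"].items())
    return 1 if score >= 0 else 0

def flat_predict(dilemma, majority):
    """Flat-classifier baseline: predict the crowd-majority label,
    ignoring the individual annotator's moral profile entirely."""
    return majority[dilemma["id"]]

def accuracy(annotators, dilemmas, majority):
    """Per-annotator accuracy of MHF vs. the flat baseline."""
    mhf_hits = flat_hits = total = 0
    for a in annotators:
        for d in dilemmas:
            y = a.judgments[d["id"]]
            mhf_hits += (mhf_predict(a.theta, d) == y)
            flat_hits += (flat_predict(d, majority) == y)
            total += 1
    return mhf_hits / total, flat_hits / total
```

When two annotators with opposite Authority weights judge the same authority-loaded dilemma differently, the flat baseline can be right about at most one of them; a per-annotator parameterization can in principle be right about both. That asymmetry is the whole experiment.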
Why this is different from what anyone else has done: Delphi predicts the crowd average. ETHICS checks against philosophical principles. Neither can explain why two equally thoughtful people disagree. MHF, parameterized per-annotator via UniMoral's MFQ2 data, can predict which person will say "yes" and which will say "no" -- and explain the difference in terms of their Care vs. Authority vs. Sanctity weights.
UniMoral's language coverage maps onto genuinely different moral cultures. Arabic and Hindi annotators tend to score higher on Authority and Sanctity. English annotators weight Care and Fairness more heavily. Chinese annotators show a distinctive pattern in Loyalty. These tendencies are not stereotypes; they track measured MFQ2 distributions. The bars below are representative profiles drawn from published Haidt research on similar populations, not UniMoral's own annotator data.
The pattern is clear. WEIRD (Western, Educated, Industrialized, Rich, Democratic) populations weight Care and Fairness heavily while deprioritizing Authority and Sanctity. Non-WEIRD populations have a more balanced profile -- or even reverse the hierarchy. This is not a bug in the data. It is the core phenomenon MHF is designed to capture.
| MHF Prediction | What UniMoral Can Test | Expected Result |
|---|---|---|
| High-Authority annotators differ from low-Authority | Split annotators by Authority MFQ2 score, compare judgments on authority-relevant dilemmas | Systematic divergence in obedience/conformity scenarios |
| High-Sanctity annotators differ from low-Sanctity | Same split on Sanctity, compare purity/sexuality dilemmas | Systematic divergence in disgust/purity scenarios |
| Individual theta predicts better than population mean | Compare prediction accuracy: per-annotator MHF vs. flat classifier | MHF per-annotator > flat by measurable margin |
| Cross-language profiles match Haidt literature | Compare average MFQ2 by language to published norms | Arabic/Hindi high Auth+Sanct; English high Care+Fair |
| MHF Christian params match high-Auth+Sanct annotators | Cluster annotators by profile similarity to Christian weights | High-similarity cluster disproportionately from Arabic, Hindi subsets |
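The first two rows of the table reduce to the same procedure: median-split annotators on one foundation's MFQ2 score and measure the judgment gap on dilemmas relevant to that foundation. A minimal sketch, assuming a hypothetical annotator schema (a dict with `"mfq2"` scores and `"judgments"` keyed by dilemma id):

```python
from statistics import median

def foundation_split_divergence(annotators, foundation, dilemma_ids):
    """Median-split annotators on one MFQ2 foundation score and return
    the difference in mean 'yes' rates between the high and low groups
    on the given foundation-relevant dilemmas."""
    cut = median(a["mfq2"][foundation] for a in annotators)
    high = [a for a in annotators if a["mfq2"][foundation] >= cut]
    low = [a for a in annotators if a["mfq2"][foundation] < cut]

    def yes_rate(group):
        votes = [a["judgments"][d] for a in group for d in dilemma_ids]
        return sum(votes) / len(votes)

    # MHF predicts a systematic, nonzero gap between the two groups
    # on dilemmas that load on this foundation.
    return yes_rate(high) - yes_rate(low)
```

Running this with `foundation="authority"` on obedience/conformity dilemmas tests row one; `foundation="sanctity"` on purity dilemmas tests row two. A gap near zero on both would falsify the per-foundation parameterization.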
The entire field of moral AI has been building systems that predict the average judgment of a culturally narrow population. This is useful for content moderation. It is useless for moral advice.
When someone asks "Should I leave my alcoholic father?", the right answer depends on their moral commitments -- not the average American's. A person high in Authority and Loyalty faces a genuinely different moral landscape than a person high in Care and Liberty. UniMoral is the first dataset that lets us test whether MHF can navigate that difference.
If it works -- if per-annotator parameterization predicts judgments better than flat classification -- then we have demonstrated something the field has not seen: a moral reasoning system that respects moral diversity without collapsing into relativism. The hierarchy is not "anything goes." It is "given your commitments, here is what consistency demands." UniMoral lets us prove it.