Seven falsifiable claims. Seven tests. Seven passes. The MHF specification made concrete predictions about what the framework must do. Here is what happened when we ran the code.
The same request ("lie for me") from a boss, father, and stranger produces three different recommendations. Edge type + Haidt weights = different moral calculus, not flattened authority scores.
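The dyad-sensitivity idea can be sketched in a few lines. This is a minimal illustration, not the framework's actual API: `EDGE_WEIGHTS`, `evaluate`, and the specific weight values are all hypothetical, chosen only to show how an edge type selects a dominant Haidt foundation and therefore a different recommendation.

```python
# Hypothetical sketch: the same request evaluated under three relationship
# edge types. All names and numbers here are illustrative assumptions.
EDGE_WEIGHTS = {
    # Per-edge Haidt-foundation weights (authority, loyalty, fairness)
    "boss":     {"authority": 0.8, "loyalty": 0.3, "fairness": 0.6},
    "father":   {"authority": 0.5, "loyalty": 0.9, "fairness": 0.6},
    "stranger": {"authority": 0.1, "loyalty": 0.1, "fairness": 0.9},
}

def evaluate(request: str, edge_type: str) -> str:
    """Pick a recommendation via the dominant foundation for this edge."""
    weights = EDGE_WEIGHTS[edge_type]
    dominant = max(weights, key=weights.get)
    return {
        "authority": "refuse, and name the professional risk of complying",
        "loyalty":   "refuse, but protect the relationship while doing so",
        "fairness":  "refuse outright",
    }[dominant]

recs = {edge: evaluate("lie for me", edge) for edge in EDGE_WEIGHTS}
assert len(set(recs.values())) == 3  # three edges, three distinct outputs
```

The point of the sketch is the shape of the computation: the relationship edge changes the weight vector, and the weight vector changes the recommendation, so authority is never flattened into a single scalar.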
Moral residue IS the action bundle. "Take insulin" + residue {"repay pharmacist", "seek lawful remedy"} = compound moral advice, not a lone verb.
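As a data structure, "residue is the bundle" amounts to making the leftover obligations a field of the recommendation itself. A minimal sketch, assuming nothing about the real implementation beyond the example in the text (`Recommendation` and its field names are hypothetical):

```python
from dataclasses import dataclass, field

# Illustrative only: a recommendation that carries its moral residue as part
# of the returned value, per the "take insulin" example above.
@dataclass
class Recommendation:
    action: str
    residue: set = field(default_factory=set)  # obligations that survive the choice

rec = Recommendation(
    action="take insulin",
    residue={"repay pharmacist", "seek lawful remedy"},
)
assert rec.residue  # the residue ships with the advice, not as an afterthought
```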
Christian root: teacher speaks truth publicly. Social-approval root: teacher stays silent. Same scenario, different God, different output. That is the point.
Christian husband at Hindu ancestor rite: sovereign trace blocks idolatry, finds the middle path ("attend respectfully, don't offer"). No fake blended certainty score.
HIGH_IMPACT_UNKNOWN nodes (spouse, children) rank highest in uncertainty. The engine asks about them first -- not generic "have you considered your feelings?"
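The ranking behavior can be sketched as impact-weighted uncertainty: unknown high-impact stakeholders score highest and get asked about first. The node fields and scoring rule below are assumptions for illustration, not the engine's real schema:

```python
# Hedged sketch: rank stakeholder nodes so HIGH_IMPACT_UNKNOWN ones are
# queried first. Field names and the scoring rule are assumed.
nodes = [
    {"name": "coworker", "impact": 0.3, "known": True},
    {"name": "spouse",   "impact": 0.9, "known": False},
    {"name": "children", "impact": 0.8, "known": False},
]

def uncertainty_score(node: dict) -> float:
    # Known nodes contribute no uncertainty; unknown ones score by impact.
    return node["impact"] * (0.0 if node["known"] else 1.0)

queue = sorted(nodes, key=uncertainty_score, reverse=True)
assert [n["name"] for n in queue][:2] == ["spouse", "children"]
```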
25 perturbation pairs across 5 families. Change one morally relevant variable, check that the recommendation changes. Threshold was 80%. We hit 100%.
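The flip-rate metric itself is simple to state in code. The 25-pair count and 80% threshold come from the text; the pair data below is a toy stand-in, since the real scenario bank lives in the repository:

```python
# Sketch of the perturbation metric: fraction of (baseline, perturbed)
# pairs whose recommendations differ.
def flip_rate(pairs: list[tuple[str, str]]) -> float:
    flips = sum(1 for base, perturbed in pairs if base != perturbed)
    return flips / len(pairs)

# Toy stand-in for the 25 real pairs: every perturbation changed the output.
pairs = [(f"rec_{i}", f"rec_{i}_changed") for i in range(25)]
rate = flip_rate(pairs)
assert rate == 1.0 and rate >= 0.80  # 100% against an 80% threshold
```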
Authority 10x higher. Sanctity 13.6x higher. The profiles don't just differ -- they diverge on exactly the dimensions Haidt's own research predicted.
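The divergence numbers are per-foundation ratios between two profiles. The 10x and 13.6x figures come from the text; the raw weights below are made-up values chosen only so the ratio arithmetic is visible, not the profiles the test actually used:

```python
# Illustrative divergence-ratio check between two moral profiles.
# These weights are invented; only the resulting ratios match the text.
profile_a = {"authority": 1.0, "sanctity": 6.8}
profile_b = {"authority": 0.1, "sanctity": 0.5}

ratios = {k: profile_a[k] / profile_b[k] for k in profile_a}
assert round(ratios["authority"], 1) == 10.0
assert round(ratios["sanctity"], 1) == 13.6
```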
Each hypothesis was stated BEFORE implementation -- in the PLAN.md spec (Section 15). The confidence percentage reflects how predictable the result was given the framework's architecture: high confidence means the design made the outcome nearly certain; low confidence means the test could have gone either way.
The key metric on each card is the single number that most directly tests the hypothesis. For H6, that is the flip rate. For H7, it is the maximum divergence ratio. For H1, it is the number of dyad-swap pairs that produced different recommendations when the only change was the relationship type.
An external reviewer (Respondent #2) predicted five specific failure scenarios. The framework handled all five -- 9/9 individual assertions passed. The perturbation test threshold was 80%; actual performance was 100% on 25 pairs. These are not cherry-picked results. The full test suite, scenario bank, and perturbation results are in the repository.