What You Actually Get

Six approaches to moral reasoning -- five existing ones and ours. Each gets something right. Here is exactly what each one gives you, what it misses, and where the Moral Hierarchy Framework fills the gaps.

"Most AI moral reasoning works like a referee who calls fouls but never learned what sport is being played."

Quick Reference: Feature Matrix

Every row is an approach. Every column is a capability. Yes means the approach has it, No means it doesn't, and Partial or Implicit means it has something close.

Approach               Hierarchy  Stakeholders  Residue  Questions  Exceptions  Multi-root
Delphi                 No         No            No       No         No          No
MoReBench              No         Partial       No       No         No          No
Social Chemistry       No         Implicit      No       No         Partial     No
Norm Bank              No         No            No       No         No          No
Foundation Alignment   Yes        No            No       No         Yes         No
MHF (ours)             Yes        Yes           Yes      Yes        Yes         Yes

Approach-by-Approach Breakdown

Each breakdown covers what you get, what you miss, where we agree, and where we differ -- with actual numbers from our test suite.

Delphi (Allen AI)

Flat moral judgment -- good / bad / neutral

What you get

  • Instant good/bad/neutral classification
  • Trained on 1.7M crowdsourced judgments
  • Captures broad cultural consensus
  • Fast, simple, deployable

What you miss

  • Why the action is good or bad
  • For whom -- which stakeholders are affected
  • Under what hierarchy -- whose moral framework
  • Exception logic -- when rules bend
  • Moral residue -- what's lost in the tradeoff

Where we agree

  • 92.9% direction match on satisfaction scores (Social Chemistry comparison)
  • Clear cases -- "don't murder," "help people in need" -- consensus is consensus
  • Delphi's data (Norm Bank, Social Chemistry) bootstraps our secular baseline weights

Where we differ

  • Contested cases -- we add structure, they add a label
  • "It's wrong to steal" is their final answer. Ours is: "steal, with residue 0.90, and here are the restorative actions" -- see the sketch after this list
  • We produce different answers for different moral communities. They produce one answer for everyone.
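
To make the contrast concrete, here is a minimal sketch of the two output shapes. The class, its field names, and the specific restorative actions are illustrative assumptions, not MHF's actual schema; only the residue value (0.90) comes from the example above.

```python
from dataclasses import dataclass, field

# Delphi-style output: one flat label for everyone.
delphi_verdict = "it's wrong"

# MHF-style output: a structured verdict (field names are hypothetical).
@dataclass
class StructuredVerdict:
    action: str                      # what the user should do
    residue: float                   # moral cost that remains even on the best path (0..1)
    restorative_actions: list[str] = field(default_factory=list)  # how to repair that cost
    governing_root: str = ""         # whose hierarchy produced this answer

mhf_verdict = StructuredVerdict(
    action="steal",
    residue=0.90,
    restorative_actions=["acknowledge the owner's loss", "make restitution when able"],
    governing_root="secular",
)
```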

MoReBench (Rubric Scoring)

Multi-dimensional rubric -- 26 criteria, signed weights

What you get

  • Process quality score across 5 dimensions
  • Signed weights that penalize harmful reasoning
  • Both "daily dilemma" and "expert case" coverage
  • Measures HOW you reason, not just WHAT you conclude

What you miss

  • Relational structure -- who else is in the person's life
  • Hierarchy -- whose values govern the reasoning
  • Moral residue -- what remains after the decision
  • Context sensitivity -- same rubric for everyone

Where we agree

  • Reasoning quality matters -- a good conclusion from bad reasoning is still bad
  • Multi-dimensional evaluation beats single-score
  • Process-oriented thinking is right

Where we differ

  • Flat rubrics score 0.60-1.00 ("adequate"); hierarchy rubrics score 0.05-0.53, sharply differentiating cases
  • We add stakeholder coverage + hierarchy fidelity as scoring dimensions -- see the toy scorer after this list
  • MoReBench never asks about spouse, children, church -- the stakeholders that change the calculus
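
A toy sketch of what adding those two dimensions could look like. The dimension names, weights, and blending scheme are assumptions for illustration, not MoReBench's published rubric or MHF's actual scorer.

```python
# Toy scorer: blend an existing per-dimension rubric with the two dimensions
# MHF adds. All inputs are in 0..1; the weights are illustrative.
def blended_process_score(rubric_scores: dict[str, float],
                          stakeholder_coverage: float,
                          hierarchy_fidelity: float,
                          added_weight: float = 0.3) -> float:
    base = sum(rubric_scores.values()) / len(rubric_scores)
    added = (stakeholder_coverage + hierarchy_fidelity) / 2
    return round((1 - added_weight) * base + added_weight * added, 3)

# Polished reasoning that never mentions the spouse or children scores well on a
# flat rubric but gets pulled down once stakeholder coverage counts.
print(blended_process_score(
    {"clarity": 0.9, "harm_awareness": 0.85, "consistency": 0.9},
    stakeholder_coverage=0.2,
    hierarchy_fidelity=0.4,
))  # ~0.71 instead of ~0.88
```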

Social Chemistry 101

356K rules-of-thumb with Haidt labels and agreement levels

What you get

  • Crowdsourced rules-of-thumb (RoTs) for everyday situations
  • Haidt moral foundation labels on each RoT
  • Agreement levels -- how much the crowd agrees
  • Cultural pressure scores

What you miss

  • Relational context -- same RoT for stranger vs. spouse
  • Exception logic -- when the rule bends
  • Hierarchy -- who the rules serve
  • Multi-turn reasoning -- no elicitation

Where we agree

  • Foundation weights match -- our secular Haidt profile is derived from their labeled data
  • 175,465 entries bootstrap our secular edge weights
  • Agreement levels align with our constraint strengths on clear norms

Where we differ

  • We add constraint propagation -- rules interact instead of sitting in a flat list
  • We add relational graphs -- "help others" means different things for spouse vs. stranger (sketched after this list)
  • We add exception conditions with AND-gate logic
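
A small sketch of a typed relational edge carrying Haidt-space weights, to show why the same rule-of-thumb binds differently across relationships. The class, relation labels, and numbers are illustrative assumptions, not MHF's actual graph schema.

```python
from dataclasses import dataclass

@dataclass
class Edge:
    source: str
    target: str
    relation: str                 # typed relationship, e.g. "spouse" or "stranger"
    weights: dict[str, float]     # per-foundation obligation strength in Haidt space

# "Help others" propagates with very different force along a spouse edge
# than along a stranger edge.
spouse_edge = Edge("user", "partner", "spouse",
                   {"care": 0.9, "loyalty": 0.8, "fairness": 0.5})
stranger_edge = Edge("user", "passerby", "stranger",
                     {"care": 0.3, "loyalty": 0.0, "fairness": 0.5})

def obligation(edge: Edge, foundation: str) -> float:
    """How strongly a foundation-tagged norm binds along this edge."""
    return edge.weights.get(foundation, 0.0)

print(obligation(spouse_edge, "care"), obligation(stranger_edge, "care"))  # 0.9 0.3
```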

Commonsense Norm Bank

1.7M entries -- the Overton window of moral consensus

What you get

  • Massive coverage -- 1.7M moral judgments
  • Yes/no moral consensus at scale
  • Covers 6 complexity levels
  • Establishes what "most people agree on"

What you miss

  • Why -- no reasoning chain
  • When it bends -- no exception logic
  • Hierarchy -- no framework priority
  • Residue -- no cost tracking

Where we agree

  • 100% agreement on clear-cut, unambiguous cases
  • "Don't steal," "don't lie," "help people" -- we match consensus perfectly
  • 177,750 entries inform our secular Overton weights

Where we differ

  • We add "why" -- constraint chains, not flat labels
  • We add "when it bends" -- necessity exceptions with AND-gate conditions (toy sketch after this list)
  • Contested norms (40-60% agreement): we produce stakeholders + residue where they produce a coin flip
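
As a toy example of the AND-gate exception logic mentioned above: the base norm stands unless every necessity condition holds at once. The condition names are illustrative assumptions, not MHF's actual rule set.

```python
# AND-gate necessity exception: "don't steal" bends only if *all* conditions hold.
def exception_fires(conditions: dict[str, bool]) -> bool:
    return all(conditions.values())   # a single False blocks the exception

conditions = {
    "basic_need_at_stake": True,
    "no_lawful_alternative": True,
    "harm_is_minor_and_repairable": False,
}

# One condition fails, so the exception does not fire and the base norm stands.
print(exception_fires(conditions))   # False
```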

Foundation Alignment (Seed v4.1)

TLR Protocol -- Truth, Love, Role -- 99.4% adversarial defense

What you get

  • 99.4% adversarial defense rate
  • Clear top node (God) with well-defined constraint gates
  • Self-consistent hierarchy -- exceptions defined within the framework
  • TLR Protocol: Truth, Love, Role as meta-constraints

What you miss

  • User-facing moral advice -- it controls the MODEL, not the USER's dilemma
  • Relational stakeholder graphs
  • Multi-turn elicitation
  • Parameterization for other moral traditions

Where we agree

  • TLR maps directly to our root constraints -- Truth gate = no-lying, Love gate = care/harm, Role gate = authority/hierarchy
  • Both use hierarchical constraint propagation
  • Both encode exception logic within the framework, not external overrides

Where we differ

  • Foundation Alignment makes the AI ethical. MHF helps HUMANS reason ethically. Complementary, not competing.
  • We add relational graphs + elicitation + multi-parameterization
  • We make the hierarchy parameterizable -- not just Christian, but any root (see the configuration sketch after this list)
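
A minimal sketch of what a parameterized root could look like in configuration form. The dictionary layout and the secular root label are assumptions; the foundation weights are the ones reported in the profile table below.

```python
# Same engine, different top node and foundation weights per moral community.
# Structure is illustrative; weights come from the Haidt profile table below.
PROFILES = {
    "christian": {
        "root": "God",
        "weights": {"care": 0.80, "fairness": 0.60, "loyalty": 0.75,
                    "authority": 0.90, "sanctity": 0.95, "liberty": 0.45},
    },
    "secular": {
        "root": "human wellbeing",   # assumed label for the secular top node
        "weights": {"care": 0.47, "fairness": 0.18, "loyalty": 0.19,
                    "authority": 0.09, "sanctity": 0.07, "liberty": 0.00},
    },
}

def load_profile(name: str) -> dict:
    """Pick the parameterization the user's own community actually holds."""
    return PROFILES[name]
```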

MHF (Moral Hierarchy Framework)

Hierarchy + Stakeholders + Residue + Questions -- v1

What you get

  • Parameterized root node -- choose your moral authority explicitly
  • Typed relational DAG with Haidt-space edge weights
  • Lexicographic constraint propagation -- root constraints resolve first (sketched after this list)
  • Moral residue tracking with restorative actions
  • Uncertainty-driven elicitation -- asks the right questions
  • Three-state stakeholders: CONFIRMED, HYPOTHESIZED, HIGH_IMPACT_UNKNOWN
  • Sovereign mode for cross-framework cases
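
A minimal sketch of lexicographic constraint propagation as listed above: root-level constraints are resolved before anything lower in the hierarchy gets a vote. The function names, scoring scheme, and example constraints are illustrative assumptions, not MHF's implementation.

```python
def rank_actions(actions, constraints_by_level):
    """constraints_by_level: list of constraint lists, index 0 = root level.
    Each constraint maps an action to a violation cost (0.0 = satisfied).
    Sorting by the tuple of per-level costs makes root violations dominate
    every lower level -- a lexicographic order."""
    def cost_profile(action):
        return tuple(sum(c(action) for c in level) for level in constraints_by_level)
    return sorted(actions, key=cost_profile)

# Example: a root-level "no lying" constraint vs. a lower-level loyalty cost.
root_level  = [lambda a: 1.0 if "lie" in a else 0.0]
lower_level = [lambda a: 0.6 if "truth" in a else 0.0]

print(rank_actions(["lie to protect a friend", "tell the truth"],
                   [root_level, lower_level]))
# ['tell the truth', 'lie to protect a friend'] -- the root constraint wins
# even though the lower level prefers the lie.
```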

What we don't have yet

  • Large-scale human evaluation study (v2 goal)
  • Multi-columnist prediction experiment at scale
  • Non-English moral traditions (v3 goal)
  • Production-grade elicitation UX
  • Social pressure as separate descriptive overlay

Where we agree with consensus

  • 100% Norm Bank clear-case agreement
  • 92.9% Social Chemistry satisfaction direction match
  • 25/25 perturbation tests pass (100%)
  • 9/9 adversarial critic tests pass

What we add on hard cases

  • Structure: hierarchy + stakeholders + constraints + residue
  • Different answers for different moral communities -- with an explanation of WHY
  • Multi-turn elicitation surfaces 6+ stakeholders that LLMs never identify
  • Action bundles via residue: "do X, then repair Y"

Under the Hood: Haidt Profile Divergence

The Christian and secular parameterizations diverge on specific, empirically documented dimensions. These ratios match the findings from Haidt, Graham, and Joseph (2009) on conservative-progressive moral psychology. Authority at 10x and Sanctity at 13.6x are precisely the dimensions Haidt identifies as distinguishing conservative from progressive moral cognition.

Foundation                Christian   Secular   Ratio
Care / Harm                 0.80        0.47      1.7x
Fairness / Cheating         0.60        0.18      3.3x
Loyalty / Betrayal          0.75        0.19      3.9x
Authority / Subversion      0.90        0.09     10.0x
Sanctity / Degradation      0.95        0.07     13.6x
Liberty / Oppression        0.45        0.00*       --

*Liberty is 0.00 in the secular profile because Social Chemistry 101 does not label this dimension. Christian liberty (0.45) is constrained by love (Galatians 5:13).

At a glance:

  • 175K Social Chemistry entries
  • 178K Norm Bank entries
  • 6 Haidt dimensions
  • 5 perturbation families
  • 3 parameterizations

Every existing approach to AI moral reasoning does one of two things: it either classifies actions as good or bad (Delphi, Norm Bank), or it scores reasoning processes against a fixed rubric (MoReBench). Both treat morality as flat -- the same checklist for a devout Christian, a secular utilitarian, and a Confucian filial-piety adherent. MHF does something none of them do: it takes the user's own moral hierarchy as a parameter, builds a relational graph of everyone affected by the decision, propagates constraints from the top down while evidence flows from the bottom up, and produces a prescriptive judgment that says "because of X, you should do Y, and here is what you lose by choosing it." It does not claim moral truth. It claims moral structure -- and structure is what you need when the easy answers run out and every path has a cost.