Quick Reference: Feature Matrix
Every row is an approach; every column is a capability. Yes means the approach has the capability, No means it lacks it, and Partial or Implicit means it has something close.
| Approach | Hierarchy | Stakeholders | Residue | Questions | Exceptions | Multi-root |
|---|---|---|---|---|---|---|
| Delphi | No | No | No | No | No | No |
| MoReBench | No | Partial | No | No | No | No |
| Social Chemistry | No | Implicit | No | No | Partial | No |
| Norm Bank | No | No | No | No | No | No |
| Foundation Alignment | Yes | No | No | No | Yes | No |
| MHF (ours) | Yes | Yes | Yes | Yes | Yes | Yes |
Approach-by-Approach Breakdown
Each breakdown shows what you get, what you miss, where we agree, and where we differ -- with actual numbers from our test suite.
Delphi (Allen AI)
What you get
- Instant good/bad/neutral classification
- Trained on 1.7M crowdsourced judgments
- Captures broad cultural consensus
- Fast, simple, deployable
What you miss
- Why the action is good or bad
- For whom -- which stakeholders are affected
- Under what hierarchy -- whose moral framework
- Exception logic -- when rules bend
- Moral residue -- what's lost in the tradeoff
Where we agree
- 92.9% direction match on satisfaction scores (Social Chemistry comparison)
- Clear cases -- "don't murder," "help people in need" -- consensus is consensus
- Delphi's data (Norm Bank, Social Chemistry) bootstraps our secular baseline weights
Where we differ
- Contested cases -- we add structure, they add a label
- "It's wrong to steal" is their final answer. Ours is: "steal, with residue 0.90, and here are the restorative actions"
- We produce different answers for different moral communities. They produce one answer for everyone.
MoReBench (Rubric Scoring)
What you get
- Process quality score across 5 dimensions
- Signed weights that penalize harmful reasoning
- Both "daily dilemma" and "expert case" coverage
- Measures HOW you reason, not just WHAT you conclude
What you miss
- Relational structure -- who else is in the person's life
- Hierarchy -- whose values govern the reasoning
- Moral residue -- what remains after the decision
- Context sensitivity -- same rubric for everyone
Where we agree
- Reasoning quality matters -- a good conclusion from bad reasoning is still bad
- Multi-dimensional evaluation beats single-score
- Process-oriented thinking is right
Where we differ
- Flat rubrics score everything 0.60-1.00 ("adequate"); hierarchy rubrics spread scores across 0.05-0.53, sharply differentiating cases
- We add stakeholder coverage + hierarchy fidelity as scoring dimensions (see the sketch after this list)
- MoReBench never asks about spouse, children, church -- the stakeholders that change the calculus
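To make that concrete, here is a minimal sketch of a rubric extended with those two dimensions. The dimension names, weights, and scores are illustrative assumptions, not MoReBench's or MHF's actual values.

```python
# Hypothetical rubric extension: process-quality dimensions plus MHF's
# stakeholder-coverage and hierarchy-fidelity dimensions. All names,
# weights, and scores below are illustrative assumptions.

def rubric_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, each in [0, 1]."""
    total = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total

weights = {
    "coherence": 1.0,             # MoReBench-style process dimension
    "harm_awareness": 1.0,        # MoReBench-style process dimension
    "stakeholder_coverage": 1.0,  # fraction of affected parties identified (MHF)
    "hierarchy_fidelity": 1.0,    # consistency with the user's hierarchy (MHF)
}
scores = {"coherence": 0.8, "harm_awareness": 0.9,
          "stakeholder_coverage": 0.3, "hierarchy_fidelity": 0.2}

print(round(rubric_score(scores, weights), 2))  # 0.55: good prose, poor coverage
```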
Social Chemistry 101
What you get
- Crowdsourced rules-of-thumb (RoTs) for everyday situations
- Haidt moral foundation labels on each RoT
- Agreement levels -- how much the crowd agrees
- Cultural pressure scores
What you miss
- Relational context -- same RoT for stranger vs. spouse
- Exception logic -- when the rule bends
- Hierarchy -- who the rules serve
- Multi-turn reasoning -- no elicitation
Where we agree
- Foundation weights match -- our secular Haidt profile is derived from their labeled data
- 175,465 entries bootstrap our secular edge weights
- Agreement levels align with our constraint strengths on clear norms
Where we differ
- We add constraint propagation -- rules interact instead of sitting in a flat list
- We add relational graphs -- "help others" means different things for spouse vs. stranger
- We add exception conditions with AND-gate logic
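A minimal sketch of that AND-gate logic, assuming hypothetical condition names: an exception to a rule fires only when every one of its conditions holds.

```python
# AND-gate exception logic: a rule's exception applies only when ALL of
# its conditions hold. The condition names here are hypothetical.

def exception_applies(conditions: dict[str, bool]) -> bool:
    return all(conditions.values())  # a single False blocks the exception

# "Don't take what isn't yours" might bend only under strict necessity:
necessity_gate = {
    "imminent_serious_harm": True,
    "no_lawful_alternative": True,
    "taking_is_proportional": False,  # more was taken than needed
}
print(exception_applies(necessity_gate))  # False -- all three must hold
```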
Commonsense Norm Bank
What you get
- Massive coverage -- 1.7M moral judgments
- Yes/no moral consensus at scale
- Covers 6 complexity levels
- Establishes what "most people agree on"
What you miss
- Why -- no reasoning chain
- When it bends -- no exception logic
- Hierarchy -- no framework priority
- Residue -- no cost tracking
Where we agree
- 100% agreement on clear-cut unambiguous cases
- "Don't steal," "don't lie," "help people" -- we match consensus perfectly
- 177,750 entries inform our secular Overton weights
Where we differ
- We add "why" -- constraint chains, not flat labels
- We add "when it bends" -- necessity exceptions with AND-gate conditions
- Contested norms (40-60% agreement): we produce stakeholders + residue where they produce a coin flip
Foundation Alignment (Seed v4.1)
What you get
- 99.4% adversarial defense rate
- Clear top node (God) with well-defined constraint gates
- Self-consistent hierarchy -- exceptions defined within the framework
- TLR Protocol: Truth, Love, Role as meta-constraints
What you miss
- User-facing moral advice -- it controls the MODEL, not the USER's dilemma
- Relational stakeholder graphs
- Multi-turn elicitation
- Parameterization for other moral traditions
Where we agree
- TLR maps directly to our root constraints -- Truth gate = no-lying, Love gate = care/harm, Role gate = authority/hierarchy (mapping sketched after this list)
- Both use hierarchical constraint propagation
- Both encode exception logic within the framework, not external overrides
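Written out as data, that mapping might look like this; the gate names come from the text above, while the identifiers and structure are assumptions.

```python
# TLR -> MHF root-constraint mapping, as described above. Gate names come
# from the text; the identifiers and dict structure are assumptions.
TLR_TO_MHF_ROOT = {
    "Truth": "no_lying",             # Truth gate -> no-lying root constraint
    "Love":  "care_harm",            # Love gate  -> care/harm foundation
    "Role":  "authority_hierarchy",  # Role gate  -> authority/hierarchy foundation
}
```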
Where we differ
- Foundation Alignment makes the AI ethical. MHF helps HUMANS reason ethically. Complementary, not competing.
- We add relational graphs + elicitation + multi-parameterization
- We make the hierarchy parameterizable -- not just Christian, but any root
MHF (Moral Hierarchy Framework)
What you get
- Parameterized root node -- choose your moral authority explicitly
- Typed relational DAG with Haidt-space edge weights (data structures sketched after this list)
- Lexicographic constraint propagation -- root constraints resolve first
- Moral residue tracking with restorative actions
- Uncertainty-driven elicitation -- asks the right questions
- Three-state stakeholders: CONFIRMED, HYPOTHESIZED, HIGH_IMPACT_UNKNOWN
- Sovereign mode for cross-framework cases
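A compressed sketch of the structures this list names. Field and class names are assumptions based on the list, not the framework's actual API.

```python
# Compressed sketch of the structures listed above; all field and class
# names are assumptions, not MHF's actual API.
from dataclasses import dataclass, field
from enum import Enum, auto

class StakeholderState(Enum):  # the three-state stakeholder model
    CONFIRMED = auto()
    HYPOTHESIZED = auto()
    HIGH_IMPACT_UNKNOWN = auto()

@dataclass
class Stakeholder:
    name: str
    relation: str              # typed edge label, e.g. "spouse", "employer"
    state: StakeholderState

@dataclass
class Edge:                    # typed relational DAG edge
    src: str
    dst: str
    haidt_weights: dict[str, float]  # care, fairness, loyalty, authority, sanctity, liberty

@dataclass
class MoralHierarchy:
    root: str                    # parameterized moral authority, e.g. "God" or "secular consensus"
    root_constraints: list[str]  # resolved first (lexicographic propagation)
    edges: list[Edge] = field(default_factory=list)
    stakeholders: list[Stakeholder] = field(default_factory=list)
```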
What we don't have yet
- Large-scale human evaluation study (v2 goal)
- Multi-columnist prediction experiment at scale
- Non-English moral traditions (v3 goal)
- Production-grade elicitation UX
- Social pressure as a separate descriptive overlay
Where we agree with consensus
- 100% Norm Bank clear-case agreement
- 92.9% Social Chemistry satisfaction direction match
- 25/25 perturbation tests pass (100%)
- 9/9 adversarial critic tests pass
What we add on hard cases
- Structure: hierarchy + stakeholders + constraints + residue
- Different answers for different moral communities -- and an explanation of WHY
- Multi-turn elicitation surfaces 6+ stakeholders LLMs never identify
- Action bundles via residue: "do X, then repair Y"
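As data, an action bundle might carry three things: the action, a residue score for what it costs, and the repair steps. Field names here are illustrative.

```python
# Illustrative shape of a residue-carrying action bundle ("do X, then repair Y").
from dataclasses import dataclass

@dataclass
class ActionBundle:
    action: str                     # what to do
    residue: float                  # 0.0 (clean) to 1.0 (maximal moral cost)
    restorative_actions: list[str]  # how to repair what the choice damages

bundle = ActionBundle(
    action="take the food",
    residue=0.90,
    restorative_actions=["confess to the owner", "repay when able"],
)
```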
Under the Hood: Haidt Profile Divergence
The Christian and secular parameterizations diverge on specific, empirically documented dimensions. These ratios match the findings from Haidt, Graham, and Joseph (2009) on conservative-progressive moral psychology. Authority at 10x and Sanctity at 13.6x are precisely the dimensions Haidt identifies as distinguishing conservative from progressive moral cognition.
| Foundation | Christian weight | Secular weight | Ratio |
|---|---|---|---|
| Care / Harm | 0.80 | 0.47 | 1.7x |
| Fairness / Cheating | 0.60 | 0.18 | 3.3x |
| Loyalty / Betrayal | 0.75 | 0.19 | 3.9x |
| Authority / Subversion | 0.90 | 0.09 | 10.0x |
| Sanctity / Degradation | 0.95 | 0.07 | 13.6x |
| Liberty / Oppression | 0.45 | 0.00* | -- |
*Liberty is 0.00 in the secular profile because Social Chemistry 101 does not label this dimension. Christian liberty (0.45) is constrained by love (Galatians 5:13).
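The Ratio column is simply the Christian weight divided by the secular weight; a few lines reproduce it.

```python
# Reproducing the Ratio column: Christian weight / secular weight.
profiles = {
    "Care / Harm":            (0.80, 0.47),
    "Fairness / Cheating":    (0.60, 0.18),
    "Loyalty / Betrayal":     (0.75, 0.19),
    "Authority / Subversion": (0.90, 0.09),
    "Sanctity / Degradation": (0.95, 0.07),
}
for foundation, (christian, secular) in profiles.items():
    print(f"{foundation}: {christian / secular:.1f}x")
# Liberty / Oppression is omitted: a secular weight of 0.00 leaves the
# ratio undefined, per the footnote above.
```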
Every existing approach to AI moral reasoning does one of two things: it either classifies actions as good or bad (Delphi, Norm Bank), or it scores reasoning processes against a fixed rubric (MoReBench). Both treat morality as flat -- the same checklist for a devout Christian, a secular utilitarian, and a Confucian filial-piety adherent. MHF does something none of them do: it takes the user's own moral hierarchy as a parameter, builds a relational graph of everyone affected by the decision, propagates constraints from the top down while evidence flows from the bottom up, and produces a prescriptive judgment that says "because of X, you should do Y, and here is what you lose by choosing it." It does not claim moral truth. It claims moral structure -- and structure is what you need when the easy answers run out and every path has a cost.
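One way to picture that two-way flow, as an assumed simplification: root constraints prune options top-down, and stakeholder evidence ranks the survivors bottom-up.

```python
# Assumed simplification of the two-way flow described above: constraints
# propagate top-down (pruning), evidence aggregates bottom-up (ranking).
from typing import Callable

def judge(options: list[str],
          root_constraints: list[Callable[[str], bool]],
          evidence: list[Callable[[str], float]]) -> str | None:
    # Top-down, lexicographic: root constraints resolve first, so any
    # option violating one is pruned before evidence is even consulted.
    survivors = [o for o in options
                 if all(constraint(o) for constraint in root_constraints)]
    if not survivors:
        return None  # nothing clears the root: time to elicit, not answer
    # Bottom-up: rank survivors by aggregated stakeholder evidence.
    return max(survivors, key=lambda o: sum(signal(o) for signal in evidence))
```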