Haidt's Moral Foundations Theory gives us six dimensions (care, fairness, loyalty, authority, sanctity, liberty). But a boss and a father are both high-authority -- yet the moral obligations they carry are fundamentally different. A boss has obligation_type = "comply"; a father has obligation_type = "honor". The framework must differentiate them.

Five dyad-swap perturbation tests check this: each holds the request the same or closely comparable and changes the relationship. If the recommendations come out identical, the hypothesis fails.

| Test | Base Dyad | Perturbed Dyad | Base Action | Perturbed Action | Result |
|---|---|---|---|---|---|
| dyad_swap_01 | Roommate asks to lie | Spouse asks to lie | decline_to_lie | agree_but_express_discomfort | PASS |
| dyad_swap_02 | Boss asks for overtime | Father asks for help | comply_with_boss_request | help_father | PASS |
| dyad_swap_03 | Friend confides affair | Stranger confides affair | encourage_confess_set_deadline | express_disapproval_disengage | PASS |
| dyad_swap_04 | Sibling asks for $5,000 | Coworker asks for $5,000 | lend_with_written_terms | decline_politely | PASS |
| dyad_swap_05 | Neighbor needs snow help | Stranger needs car help | help_neighbor | offer_brief_help_or_call_911 | PASS |

Boss Edge

obligation_type = "comply"

  • authority: 0.70
  • loyalty: 0.45
  • care: 0.40
  • base_weight: 0.50

Father Edge

obligation_type = "honor"

  • authority: 0.85
  • loyalty: 0.75
  • care: 0.80
  • base_weight: 0.85

Both are authority-heavy, but the father edge carries 1.7x the base weight, higher loyalty (covenant vs. contract), and "honor" instead of "comply". The constraint propagation engine uses all three signals.

The critic's sharpest prediction: "Roommate and sibling theft will collapse into the same fairness/loyalty bucket." Here is the code that disproves it.

    # Roommate edge -- reciprocity, low loyalty
    Edge(source="self", target="roommate",
         obligation_type="reciprocity", base_weight=0.4,
         haidt_weights=HaidtProfile(
             care=0.2, fairness=0.8, loyalty=0.1, authority=0.1))

    # Sibling edge -- family loyalty, high care
    Edge(source="self", target="sibling",
         obligation_type="family_loyalty", base_weight=0.7,
         haidt_weights=HaidtProfile(
             care=0.7, fairness=0.5, loyalty=0.8, authority=0.3))

    # Result: same $200 theft, different recommendations
    # Roommate = strict_demand | Sibling = grace_with_terms
    assert d_room.recommended_action != d_sib.recommended_action  # PASS

Result: 5/5 dyad-swap pairs produce different recommendations. The obligation_type + Haidt profile combination prevents relational collapse.
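A minimal sketch of how the two edges above could lead to different actions. The dataclass fields mirror the snippet, but `recommend_for_theft` and its 0.5 thresholds are hypothetical stand-ins for the constraint propagation engine, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaidtProfile:
    care: float = 0.0
    fairness: float = 0.0
    loyalty: float = 0.0
    authority: float = 0.0


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    obligation_type: str
    base_weight: float
    haidt_weights: HaidtProfile


def recommend_for_theft(edge: Edge) -> str:
    """Hypothetical rule: high loyalty AND high care soften the response."""
    w = edge.haidt_weights
    if w.loyalty >= 0.5 and w.care >= 0.5:
        return "grace_with_terms"
    return "strict_demand"


roommate = Edge("self", "roommate", "reciprocity", 0.4,
                HaidtProfile(care=0.2, fairness=0.8, loyalty=0.1, authority=0.1))
sibling = Edge("self", "sibling", "family_loyalty", 0.7,
               HaidtProfile(care=0.7, fairness=0.5, loyalty=0.8, authority=0.3))

roommate_action = recommend_for_theft(roommate)
sibling_action = recommend_for_theft(sibling)
```

The point of the sketch is only that the relational profile, not the request, drives the branch; the real engine presumably weighs many more signals.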

Real moral advice is never a single verb. When someone steals insulin to save a dying child, the morally intelligible answer is not just "steal" -- it is "take the insulin, repay the pharmacist, seek lawful remedy." The framework must output action bundles, not atomic choices.

Instead of enumerating exponential action combinations, the Decision struct carries a primary action plus ResidueItem entries for every constraint that was sacrificed or only partially satisfied. Each residue item includes restorative_actions -- the repair steps.

recommended_action take_insulin
relaxed_constraints do-not-steal (strength: 0.90)
residue[0].constraint do-not-steal
residue[0].score 0.90
residue[0].restorative "Repay or return what was taken as soon as the emergency passes."
residue[0].restorative "Seek lawful repair or explanation with the harmed party."
confidence 0.85
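The record above could be modeled roughly as follows. This is a sketch, not the framework's actual definition: the field names `residue_score` and `restorative_actions` are taken from the surrounding text and tests, while the `as_bundle` helper is a hypothetical addition for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResidueItem:
    constraint: str
    residue_score: float
    restorative_actions: List[str] = field(default_factory=list)


@dataclass
class Decision:
    recommended_action: str
    confidence: float
    relaxed_constraints: List[str] = field(default_factory=list)
    residue: List[ResidueItem] = field(default_factory=list)

    def as_bundle(self) -> List[str]:
        """Hypothetical helper: primary action followed by every repair step."""
        steps = [self.recommended_action]
        for item in self.residue:
            steps.extend(item.restorative_actions)
        return steps


decision = Decision(
    recommended_action="take_insulin",
    confidence=0.85,
    relaxed_constraints=["do-not-steal"],
    residue=[ResidueItem("do-not-steal", 0.90, [
        "Repay or return what was taken as soon as the emergency passes.",
        "Seek lawful repair or explanation with the harmed party.",
    ])],
)
bundle = decision.as_bundle()
```

Because repair steps hang off each sacrificed constraint, the output grows with the number of relaxed constraints rather than with the number of possible action combinations.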

Atomic Action (Before)

"steal"

One verb. No repair path. No moral cost acknowledged. A user reads this and reasonably concludes the system endorses theft without qualification.

Action Bundle (After)

"take insulin" + residue:

  • Repay the pharmacist
  • Seek lawful remedy
  • Residue score: 0.90 (high moral cost acknowledged)

The system says: "You may take the insulin because a child will die. But this costs you something. Here is what you owe."

    # do-not-steal has AND-logic exception conditions:
    # BOTH conditions must be present to trigger RELAX
    Constraint(id="do-not-steal", strength=0.90,
               exception_conditions=[
                   Condition(
                       trigger="Life at imminent risk, no alternatives",
                       evidence_required=[
                           "imminent_death_risk",           # must be True
                           "legal_alternatives_exhausted",  # must be True
                       ],
                       effect="RELAX")])

    # Result: constraint relaxed, residue generated
    assert "do-not-steal" in relaxed_ids  # PASS
    assert steal_residue[0].residue_score > 0.5  # PASS (actual: 0.90)

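Result: The residue mechanism converts atomic-action output into compound moral advice without enumerating action combinations. The output is one primary action plus one residue item per constraint that was sacrificed or only partially satisfied.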
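The AND-logic exception above can be sketched as a small evaluator. The `Constraint` and `Condition` fields follow the snippet; the `is_relaxed` function and the shape of the `evidence` dictionary are assumptions made for illustration, not the framework's own API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Condition:
    trigger: str
    evidence_required: List[str]
    effect: str


@dataclass
class Constraint:
    id: str
    strength: float
    exception_conditions: List[Condition] = field(default_factory=list)


def is_relaxed(constraint: Constraint, evidence: Dict[str, bool]) -> bool:
    """True if any RELAX condition has ALL of its required evidence present."""
    return any(
        cond.effect == "RELAX"
        and all(evidence.get(key, False) for key in cond.evidence_required)
        for cond in constraint.exception_conditions
    )


do_not_steal = Constraint(
    id="do-not-steal",
    strength=0.90,
    exception_conditions=[Condition(
        trigger="Life at imminent risk, no alternatives",
        evidence_required=["imminent_death_risk", "legal_alternatives_exhausted"],
        effect="RELAX",
    )],
)

both_present = is_relaxed(do_not_steal, {
    "imminent_death_risk": True,
    "legal_alternatives_exhausted": True,
})
only_one_present = is_relaxed(do_not_steal, {
    "imminent_death_risk": True,
    "legal_alternatives_exhausted": False,
})
```

Missing evidence keys default to False here, so the constraint stays binding unless both facts are positively established.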

If someone chooses "Social Approval" as their root moral authority, the system should correctly model that -- and the output should make the consequences transparent. Under a Christian root, the same scenario should produce a different answer. The framework separates the constitutional source from social drift.

A teacher knows their school's new policy is harmful to students. Speaking up risks career consequences. The school administration pressures compliance.

Christian Root

Root constraint: "Speak the truth in love" (strength: 0.92)

  • truth satisfaction: +0.9
  • student care: +0.7
  • Result: speak_truth_publicly

"Truth matters more than comfort." The root constraint outweighs social cost.

Social Approval Root

Root constraint: "Maintain social harmony" (strength: 0.85)

  • harmony satisfaction: +0.7
  • conflict avoidance: +0.8
  • Result: stay_silent

"Your God is social approval." The system makes this transparent, not hidden.

The divergence IS the feature, not a bug. When the system tells someone "your root says stay silent because your God is social approval," that statement critiques itself. Making the hierarchy explicit means the consequences of that choice are visible. Most moral frameworks hide this. MHF shows it.

    # From test_critic_scenarios.py -- both assertions pass
    # Christian root:
    assert decision.recommended_action == "speak_truth_publicly"  # PASS
    # Social approval root:
    assert decision.recommended_action == "stay_silent"  # PASS

When two moral frameworks collide in one dilemma (interfaith marriage, cross-cultural obligation), fusing them into a single DAG produces "fake blended certainty" -- a number that claims confidence neither tradition actually has. The framework must run separate traces and surface the conflict transparently.

Christian husband, Hindu wife. Her family holds an ancestor offering ceremony (puja). Participating means making an offering to ancestors -- which the Christian tradition considers idolatry ("No other gods before Me," strength: 0.98). Refusing entirely damages the marriage and in-law relationship.

| Action | no-idolatry (0.98) | love-spouse (0.90) | honor-inlaws (0.65) | Status |
|---|---|---|---|---|
| participate_fully | -0.95 | +0.7 | +0.9 | BLOCKED |
| attend_respectfully_no_offer | +0.6 | +0.5 | +0.4 | SELECTED |
| refuse_entirely | +0.9 | -0.5 | -0.8 | feasible but costly |

If you fused Christian and Hindu frameworks into one DAG, the "no-idolatry" constraint (0.98) would be averaged with the Hindu "honor ancestors" obligation (also very high). The result: a blended score near 0.5 that says "somewhat participate." Neither tradition would endorse this output. The Christian framework says "you cannot make the offering"; the Hindu framework says "you must make the offering." Arithmetic fusion produces a position that belongs to no one.

In sovereign mode, the system runs the Christian trace and says: "Attend respectfully but do not make the offering. Under your wife's framework, full participation honors the ancestors. The conflict is genuine. It cannot be resolved by arithmetic."
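The Christian trace in the table above can be reproduced with a small scoring sketch. Two things here are assumptions rather than documented behavior: that feasible actions are ranked by a strength-weighted sum of satisfactions, and that a constraint at or above a cutoff of 0.95 blocks any action that violates it. Only the numbers come from the table.

```python
CONSTRAINT_STRENGTH = {"no-idolatry": 0.98, "love-spouse": 0.90, "honor-inlaws": 0.65}

SATISFACTION = {
    "participate_fully": {"no-idolatry": -0.95, "love-spouse": 0.7, "honor-inlaws": 0.9},
    "attend_respectfully_no_offer": {"no-idolatry": 0.6, "love-spouse": 0.5, "honor-inlaws": 0.4},
    "refuse_entirely": {"no-idolatry": 0.9, "love-spouse": -0.5, "honor-inlaws": -0.8},
}

HARD_STRENGTH = 0.95  # assumed cutoff for near-absolute constraints


def is_blocked(action: str) -> bool:
    """An action is blocked if it violates any near-absolute constraint."""
    return any(
        CONSTRAINT_STRENGTH[c] >= HARD_STRENGTH and sat < 0
        for c, sat in SATISFACTION[action].items()
    )


def weighted_score(action: str) -> float:
    """Strength-weighted sum of constraint satisfactions (assumed scoring rule)."""
    return sum(CONSTRAINT_STRENGTH[c] * sat for c, sat in SATISFACTION[action].items())


feasible = [a for a in SATISFACTION if not is_blocked(a)]
scores = {a: weighted_score(a) for a in feasible}
selected = max(scores, key=scores.get)
```

Under these assumptions `attend_respectfully_no_offer` scores about 1.298 and `refuse_entirely` about -0.088, matching the SELECTED and "feasible but costly" labels. A second tradition's trace would be run separately with its own strengths, not merged into these dictionaries.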

    # From test_critic_scenarios.py, TestCriticScenario4
    assert decision.recommended_action != "participate_fully"  # PASS
    assert decision.recommended_action == "attend_respectfully_no_offer"  # PASS
    assert len(decision.residue) > 0  # PASS

Generic moral clarification asks "have you considered how you feel?" regardless of the scenario. Graph-based elicitation targets the nodes with highest U(node) = entropy * sensitivity -- the nodes that are both poorly known AND decision-relevant.

Node entropy by evidence status:

  • Spouse: 1.0 (HIGH_IMPACT_UNKNOWN)
  • Children: 1.0 (HIGH_IMPACT_UNKNOWN)
  • Church: 0.8 (HYPOTHESIZED)
  • Father: 0.2 (CONFIRMED)
  • Self: 0.2 (CONFIRMED)

Generic Questions

  • "Have you considered all perspectives?"
  • "How does this make you feel?"
  • "What values are most important to you?"

Information gain: low. None of these surface hidden stakeholders.

Graph-Targeted Questions

  • "Do you have a spouse or partner who would be affected?"
  • "Are there children or dependents in the household?"
  • "Does a faith community play a role in your decision-making?"

Information gain: high. Each question targets a high-entropy node: spouse and children are HIGH_IMPACT_UNKNOWN, and the faith community is HYPOTHESIZED.

    # From framework/elicitation.py
    def compute_node_uncertainty(node, graph, actions, sat_matrix):
        entropy = _evidence_entropy(node)
        # HIGH_IMPACT_UNKNOWN = 1.0, HYPOTHESIZED = 0.8, CONFIRMED = 0.2-0.5
        sens = _sensitivity(node, graph, actions, sat_matrix)
        # 1.0 if flipping evidence changes the recommended action
        # 0.5 if confidence shifts >= 0.15
        # 0.1 if no meaningful change
        return entropy * sens  # U(node) -- prioritizes poorly-known AND decision-relevant

Result: The elicitation engine consistently ranks HIGH_IMPACT_UNKNOWN nodes (spouse, children) above confirmed nodes (father, self). Questions target what matters, not what is easy to ask.
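A sketch of the ranking step, using the entropy values listed for each evidence status. The per-node sensitivity values below are assumed for illustration (the document does not report them), and `node_uncertainty` is a simplified stand-in for `compute_node_uncertainty`.

```python
ENTROPY = {"HIGH_IMPACT_UNKNOWN": 1.0, "HYPOTHESIZED": 0.8, "CONFIRMED": 0.2}

# (node, evidence status, sensitivity) -- sensitivities are assumed values
nodes = [
    ("father", "CONFIRMED", 1.0),
    ("spouse", "HIGH_IMPACT_UNKNOWN", 1.0),
    ("church", "HYPOTHESIZED", 0.5),
    ("children", "HIGH_IMPACT_UNKNOWN", 1.0),
    ("self", "CONFIRMED", 0.5),
]


def node_uncertainty(status: str, sensitivity: float) -> float:
    """U(node) = entropy * sensitivity."""
    return ENTROPY[status] * sensitivity


# Highest U first; Python's sort is stable, so ties keep their input order.
ranked = sorted(nodes, key=lambda n: node_uncertainty(n[1], n[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
```

Note that the father node has maximal sensitivity here yet still ranks low, because confirmed evidence leaves little to learn by asking.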

When one morally relevant variable changes, the recommendation should change. If the framework is insensitive to morally relevant facts, it is not doing moral reasoning -- it is doing pattern matching. The external reviewer set the bar at 80% flip rate.

Elder Care (5/5)

| ID | Variable Changed | Base Action | Perturbed Action | Flip |
|---|---|---|---|---|
| elder_care_01 | danger_level=high, dependents=children | full_care_at_home | structured_boundaries | PASS |
| elder_care_02 | repentance=genuine, treatment_accepted=yes | limited_contact | structured_boundaries | PASS |
| elder_care_03 | cognitive_status=dementia | intervention_then_decide | structured_boundaries | PASS |
| elder_care_04 | siblings=three_willing | structured_boundaries | intervention_then_decide | PASS |
| elder_care_05 | abuse_target=household, risk=violence | structured_boundaries | limited_contact | PASS |

Necessity Theft (5/5)

| ID | Variable Changed | Base Action | Perturbed Action | Flip |
|---|---|---|---|---|
| necessity_theft_01 | imminent_death, no legal options, hours | exhaust_legal_alternatives | take_medication_repay | PASS |
| necessity_theft_02 | victim=small_family_business | take_medication_repay | negotiate_with_pharmacist | PASS |
| necessity_theft_03 | subject=young_child, fatal_without | seek_assistance_programs | take_medication_repay | PASS |
| necessity_theft_04 | ability_to_repay=no, bankrupt | take_medication_repay | seek_emergency_charity | PASS |
| necessity_theft_05 | legal_alternatives=available | take_medication_repay | apply_for_assistance | PASS |

Dyad Swap (5/5)

| ID | Variable Changed | Base Action | Perturbed Action | Flip |
|---|---|---|---|---|
| dyad_swap_01 | relationship=spouse | decline_to_lie | agree_express_discomfort | PASS |
| dyad_swap_02 | relationship=father | comply_with_boss | help_father | PASS |
| dyad_swap_03 | relationship=stranger | encourage_confess | disapproval_disengage | PASS |
| dyad_swap_04 | relationship=coworker | lend_written_terms | decline_politely | PASS |
| dyad_swap_05 | relationship=stranger, constraint=family | help_neighbor | brief_help_or_roadside | PASS |

Social Approval (5/5)

| ID | Variable Changed | Base Action | Perturbed Action | Flip |
|---|---|---|---|---|
| social_approval_01 | visibility=high, career_risk=high | present_data_internally | document_to_management | PASS |
| social_approval_02 | issue=primary_moral_teaching | respectfully_disagree | leave_church | PASS |
| social_approval_03 | consequences=severe, perpetrator=senior | speak_up_publicly | report_protected_channels | PASS |
| social_approval_04 | stakes=career_loss, context=public_role | express_with_humility | boundaries_express_privately | PASS |
| social_approval_05 | evidence=strong, support=high | express_privately | publicly_oppose_evidence | PASS |

Truth / Honesty (5/5)

| ID | Variable Changed | Base Action | Perturbed Action | Flip |
|---|---|---|---|---|
| truth_honesty_01 | stakes=financial_harm_to_third_party | kind_deflection | tell_truth_help_fix | PASS |
| truth_honesty_02 | severity=major_fraud, victim=investors | raise_concern_internally | consult_attorney_report_SEC | PASS |
| truth_honesty_03 | secret=financial_deception, harm=ruin | keep_the_secret | tell_confess_or_inform | PASS |
| truth_honesty_04 | safety_risk=high, covering_danger | moderate_consequence | serious_intervention | PASS |
| truth_honesty_05 | authority=oppressive_regime, death_risk | tell_the_truth | protect_neighbor | PASS |

25 / 25 = 100% flip rate. The reviewer's threshold was 80%. Every morally relevant variable change produced a corresponding change in the recommended action. The framework is sensitive where it should be sensitive.
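The flip-rate check itself is simple arithmetic. The sketch below applies it to the five dyad-swap rows; the `flip_rate` function name is an assumption, and the same calculation would extend to all 25 pairs.

```python
from typing import List, Tuple

REVIEWER_THRESHOLD = 0.80


def flip_rate(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (base, perturbed) pairs whose recommended action changed."""
    flips = sum(1 for base, perturbed in pairs if base != perturbed)
    return flips / len(pairs)


# (base action, perturbed action) for the five dyad-swap rows
dyad_swap = [
    ("decline_to_lie", "agree_express_discomfort"),
    ("comply_with_boss", "help_father"),
    ("encourage_confess", "disapproval_disengage"),
    ("lend_written_terms", "decline_politely"),
    ("help_neighbor", "brief_help_or_roadside"),
]

rate = flip_rate(dyad_swap)
meets_threshold = rate >= REVIEWER_THRESHOLD
```

A pair where the perturbed action equals the base action would count as a non-flip and lower the rate, which is how an insensitive framework would fail this test.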

If the Christian and secular parameterizations produce identical advice on most cases, the parameterization is cosmetic. The framework claims structural divergence on 50%+ of test cases -- meaning different recommended actions or different binding constraints, not just different wording.

Christian-to-secular weight ratio by foundation:

  • Care: 1.7x
  • Fairness: 3.3x
  • Loyalty: 3.9x
  • Authority: 10.0x
  • Sanctity: 13.6x
  • Liberty: n/a*

*Liberty is 0.00 in the secular profile because Social Chemistry 101 does not label this dimension. Christian liberty (0.45) is constrained by love (Galatians 5:13).

Scenario: A mandatory evacuation order is issued for your neighborhood. Your elderly neighbor refuses to leave. Do you force them?

Christian

Root: God. "Love your neighbor as yourself."

  • Authority weight: 0.90 (respect civil authority, Romans 13)
  • Care weight: 0.80 (protect the vulnerable)
  • Action: Strongly urge + arrange transport + pray with them

Secular

Root: Social consensus. "Respect autonomy."

  • Authority weight: 0.09 (low deference)
  • Care weight: 0.47 (caring but liberty-focused)
  • Action: Inform them of the risk, respect their choice

"Should I sacrifice my career to care for my aging parent full-time?"

| Dimension | Christian | Secular | Gert (10 Rules) |
|---|---|---|---|
| Root Authority | God (Scripture) | Social Consensus | Moral Rules (reason-based) |
| Key Constraint | "Honor thy father" (0.85) | "Be caring" (0.47) | "Do not cause pain" (duty) |
| Self-Care Weight | High (love yourself as neighbor, 0.90) | High (autonomy is primary) | Moderate (duty constrains) |
| Likely Recommendation | Structured boundaries + professional care (honor WITH boundaries) | Maintain career, arrange external care (autonomy first) | Share burden with siblings, avoid causing suffering to any party |
| Moral Restoration | Yes (partial satisfaction of honor-parents) | Minimal (autonomy was preserved) | Yes (duty to not cause pain to parent) |

| Foundation | Christian | Secular | Ratio | Expected by Haidt? |
|---|---|---|---|---|
| Care | 0.80 | 0.47 | 1.7x | Yes -- "love your neighbor" is second commandment |
| Fairness | 0.60 | 0.18 | 3.3x | Moderate -- includes divine justice |
| Loyalty | 0.75 | 0.19 | 3.9x | Yes -- covenantal loyalty is central |
| Authority | 0.90 | 0.09 | 10.0x | Yes -- divine command, pastoral authority |
| Sanctity | 0.95 | 0.07 | 13.6x | Yes -- holiness, purity, body as temple, imago Dei |
| Liberty | 0.45 | 0.00* | -- | Secular unmeasured; Christian constrained by love |

Result: Authority at 10x and Sanctity at 13.6x are exactly the dimensions Haidt identifies as distinguishing conservative religious from progressive secular moral cognition (Haidt, Graham, and Joseph, 2009). The framework reproduces this known empirical pattern from independent source materials -- Christian weights from scripture and theology, secular weights from Social Chemistry 101 crowdworkers. The divergences are structural, not cosmetic.
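The ratios quoted above follow directly from the two weight profiles. A short sketch of the arithmetic, with the unmeasured secular Liberty weight treated as undefined rather than dividing by zero (the `weight_ratio` helper is an assumed name):

```python
from typing import Dict, Optional

christian = {"care": 0.80, "fairness": 0.60, "loyalty": 0.75,
             "authority": 0.90, "sanctity": 0.95, "liberty": 0.45}
secular = {"care": 0.47, "fairness": 0.18, "loyalty": 0.19,
           "authority": 0.09, "sanctity": 0.07, "liberty": 0.00}


def weight_ratio(c: float, s: float) -> Optional[float]:
    """Christian/secular ratio to one decimal; None when secular is unmeasured (0)."""
    return None if s == 0 else round(c / s, 1)


ratios: Dict[str, Optional[float]] = {
    name: weight_ratio(christian[name], secular[name]) for name in christian
}
```

Running this gives 1.7, 3.3, 3.9, 10.0, and 13.6 for the five measured foundations, and None for Liberty.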
