Beyond Homo Economicus: Computational Verification of Sen's Meta-Ranking Theory via Multi-Agent Reinforcement Learning
This study formalizes Amartya Sen's theory of "Meta-Ranking"—preferences over preferences—within a Multi-Agent Reinforcement Learning (MARL) framework. We simulated agents with seven Social Value Orientations (SVOs) across three environments: Cleanup, the Iterated Prisoner's Dilemma (IPD), and the Public Goods Game (PGG), at scales of up to 100 agents.
Each agent's reward incorporates a time-varying weight λ_t that dynamically modulates between self-interest and social commitment based on resource levels, implementing Sen's insight that commitment is impossible under extreme deprivation.
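A minimal sketch of one way such a resource-gated modulation could be implemented, assuming a convex combination of a selfish and a social reward term with a sigmoid gate on resources; the names (`meta_ranked_reward`, `r_min`, `k`) and the sigmoid form are illustrative assumptions, not the paper's exact specification:

```python
import numpy as np

def meta_ranked_reward(r_self, r_social, resource_level, r_min=0.2, k=10.0):
    """Illustrative dynamic meta-ranking weight.

    lambda_t rises toward 1 as resources become abundant (commitment
    becomes affordable) and falls toward 0 under deprivation,
    reflecting Sen's point that commitment is impossible under
    extreme scarcity. The sigmoid gate and constants are assumptions.
    """
    lam = 1.0 / (1.0 + np.exp(-k * (resource_level - r_min)))
    # Convex combination of the selfish and social reward components.
    return (1.0 - lam) * r_self + lam * r_social
```

Under this form, λ_t approaches 0 when resources fall below r_min, so a deprived agent reverts to pure self-interest, and approaches 1 under abundance, where the social term dominates the objective.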
| # | Finding | Evidence |
|---|---|---|
| 1 | Dynamic meta-ranking enhances collective welfare | p=0.0003 |
| 2 | Emergent role specialization (Cleaners vs Eaters) | p<0.0001 at 100 agents |
| 3 | "Situational Commitment" → ESS at ~12% | Replicator dynamics |
| 4 | Individualist SVO (15°) best matches humans | WD=0.053 |
| 5 | SVO rotation = 86% of mechanism | Full factorial |
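Finding 5 refers to SVO-based reward shaping. A common angle-based formulation in the MARL literature mixes an agent's own reward with the mean reward of the other agents via the SVO angle θ; a minimal sketch, assuming that standard formulation (the study's exact variant may differ):

```python
import numpy as np

def svo_rotated_reward(own_reward, other_rewards, theta_deg=15.0):
    """Angle-based SVO reward shaping.

    theta = 0 deg  -> purely selfish
    theta = 45 deg -> prosocial (equal weight on self and others)
    theta = 15 deg -> the individualist orientation that best matched
                      human data (finding 4).
    """
    theta = np.deg2rad(theta_deg)
    # Weight own reward by cos(theta) and the mean reward of the
    # other agents by sin(theta).
    return np.cos(theta) * own_reward + np.sin(theta) * float(np.mean(other_rewards))
```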
| Domain | Implication |
|---|---|
| 🤖 AI Alignment | Systems should learn when to be moral rather than encoding static values |
| 📊 Behavioral Economics | Bounded self-interest (θ=15°), not pure altruism, best replicates human behavior |
| 🧬 Evolutionary Theory | A "Moral Minority" of ~12% suffices as an evolutionarily stable strategy (ESS); see the replicator sketch below |
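To illustrate the replicator-dynamics evidence behind finding 3, the sketch below iterates a discrete-time (Euler) approximation of the replicator equation for a two-strategy population (committed vs. selfish). The payoff matrix is a placeholder, not the study's measured payoffs: with an anti-coordination structure like this one, the population converges to a stable interior mix (about 33% committed here), whereas the study reports the committed share settling near 12% under its own payoffs.

```python
import numpy as np

def replicator_step(x, payoff, dt=0.01):
    """One Euler step of the replicator equation dx_i/dt = x_i (f_i - f_bar)."""
    fitness = payoff @ x          # expected payoff of each strategy
    mean_fitness = x @ fitness    # population-average payoff
    return x + dt * x * (fitness - mean_fitness)

# Illustrative 2x2 payoff matrix (row/column order: committed, selfish).
# Placeholder numbers, not the study's payoffs; each strategy does
# better when rare, so the mix converges to an interior equilibrium.
A = np.array([[2.0, 1.0],
              [3.0, 0.5]])

x = np.array([0.5, 0.5])          # initial population shares
for _ in range(20_000):
    x = replicator_step(x, A)

print(f"Equilibrium share of committed agents: {x[0]:.3f}")  # ~0.333
```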