Societal Alignment Frameworks Can Improve LLM Alignment
Abstract
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values, a process known as alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts: the impracticality of specifying a contract between a model developer and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than a flaw to be eliminated through perfect specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.
Community
Human alignment balances social expectations, economic incentives, and legal frameworks. What if LLM alignment worked the same way? Our latest work explores how social, economic, and contractual alignment can address incomplete contracts in LLM alignment.
LLM alignment remains a challenge because human values are complex, dynamic, and often conflict with narrow optimization goals. Existing methods like RLHF struggle with misspecified objectives.
We propose leveraging societal alignment frameworks to guide LLM alignment:
🔹 Social alignment: Modeling norms, values & cultural competence
🔹 Economic alignment: Fair reward mechanisms & collective decision-making (see the sketch after this list)
🔹 Contractual alignment: Legal principles for LLMs
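To make the collective decision-making idea concrete, here is a minimal, hypothetical Python sketch (not from the paper): it aggregates preference rankings from several stakeholder groups into a single collective score per candidate response using a Borda count, one simple social-choice rule. The function name and response IDs are illustrative assumptions, not an implementation the authors propose.

```python
# Hypothetical sketch: Borda-count aggregation of stakeholder preference rankings.
# Each ranking lists candidate response IDs from most to least preferred.
from collections import defaultdict

def borda_aggregate(rankings: list[list[str]]) -> dict[str, int]:
    """A response ranked i-th out of n in a ranking receives n - 1 - i points;
    points are summed across all stakeholder rankings."""
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, response_id in enumerate(ranking):
            scores[response_id] += n - 1 - position
    return dict(scores)

# Example: three annotator groups rank the same three candidate responses.
group_rankings = [
    ["resp_a", "resp_b", "resp_c"],  # group 1
    ["resp_b", "resp_a", "resp_c"],  # group 2
    ["resp_a", "resp_c", "resp_b"],  # group 3
]
print(borda_aggregate(group_rankings))
# {'resp_a': 5, 'resp_b': 3, 'resp_c': 1} -> resp_a is collectively preferred
```

In practice, a reward model could be trained on scores aggregated this way rather than on a single annotator pool, which is one (assumed) way to operationalize the fair-reward idea above.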
Instead of perfecting rigid alignment objectives, we explore how LLMs can navigate uncertainty—a feature, not a flaw!
We also discuss the role of participatory alignment, where diverse stakeholders help shape LLM behavior rather than deferring solely to designers.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Policy-as-Prompt: Rethinking Content Moderation in the Age of Large Language Models (2025)
- Scopes of Alignment (2025)
- Advantage-Guided Distillation for Preference Alignment in Small Language Models (2025)
- C3AI: Crafting and Evaluating Constitutions for Constitutional AI (2025)
- Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models (2025)
- STAIR: Improving Safety Alignment with Introspective Reasoning (2025)
- Token Democracy: The Architectural Limits of Alignment in Transformer-Based Language Models (2025)