MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization Paper • 2407.08818 • Published Jul 11, 2024
Steering off Course: Reliability Challenges in Steering Language Models Paper • 2504.04635 • Published 14 days ago
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control Paper • 2210.17432 • Published Oct 31, 2022 • 1
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024 • 64
RewardBench: Evaluating Reward Models for Language Modeling Paper • 2403.13787 • Published Mar 20, 2024 • 23
The Art of Saying No: Contextual Noncompliance in Language Models Paper • 2407.12043 • Published Jul 2, 2024 • 4