Tending Towards Stability: Convergence Challenges in Small Language Models Paper • 2410.11451 • Published Oct 15, 2024
Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus Paper • 2406.08598 • Published Jun 12, 2024 • 5
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale Paper • 2211.03759 • Published Nov 7, 2022
Contrastive Language-Image Pre-training for the Italian Language Paper • 2108.08688 • Published Aug 19, 2021 • 2
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models Paper • 2308.01263 • Published Aug 2, 2023
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions Paper • 2309.07875 • Published Sep 14, 2023
When and why vision-language models behave like bags-of-words, and what to do about it? Paper • 2210.01936 • Published Oct 4, 2022
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18, 2024 • 10
AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets Paper • 2404.05623 • Published Apr 8, 2024 • 3
Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features Paper • 2309.07733 • Published Sep 14, 2023
A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation Paper • 2310.12127 • Published Oct 18, 2023 • 1
Diable: Efficient Dialogue State Tracking as Operations on Tables Paper • 2305.17020 • Published May 26, 2023
Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists Paper • 2203.09192 • Published Mar 17, 2022