MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models Paper • 2410.17578 • Published Oct 23, 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models Paper • 2406.05761 • Published Jun 9, 2024
Knowledge Unlearning for Mitigating Privacy Risks in Language Models Paper • 2210.01504 • Published Oct 4, 2022
Gradient Ascent Post-training Enhances Language Model Generalization Paper • 2306.07052 • Published Jun 12, 2023
LangBridge: Multilingual Reasoning Without Multilingual Supervision Paper • 2401.10695 • Published Jan 19, 2024