When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27, 2024 • 25
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Paper • 2402.14083 • Published Feb 21, 2024 • 48
In deep reinforcement learning, a pruned network is a good network Paper • 2402.12479 • Published Feb 19, 2024 • 19
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements Paper • 2402.10963 • Published Feb 13, 2024 • 12
RLVF: Learning from Verbal Feedback without Overgeneralization Paper • 2402.10893 • Published Feb 16, 2024 • 12
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows Paper • 2402.10379 • Published Feb 16, 2024 • 31
Direct Language Model Alignment from Online AI Feedback Paper • 2402.04792 • Published Feb 7, 2024 • 31
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 115
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29, 2024 • 49
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26, 2024 • 72
Deconstructing Denoising Diffusion Models for Self-Supervised Learning Paper • 2401.14404 • Published Jan 25, 2024 • 18
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24, 2024 • 47
Large-scale Reinforcement Learning for Diffusion Models Paper • 2401.12244 • Published Jan 20, 2024 • 29
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22, 2024 • 44
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 143