CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published 24 days ago • 198
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation Paper • 2410.23090 • Published 17 days ago • 53
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published 16 days ago • 58
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published 12 days ago • 44
Balancing Pipeline Parallelism with Vocabulary Parallelism Paper • 2411.05288 • Published 8 days ago • 18
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 9 days ago • 100
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models Paper • 2411.07232 • Published 5 days ago • 54
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Paper • 1810.04805 • Published Oct 11, 2018 • 14