AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation Paper • 2503.19693 • Published 7 days ago • 56
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published 18 days ago • 79
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models Paper • 2503.18033 • Published 9 days ago • 23
A Survey on Large Language Model based Autonomous Agents Paper • 2308.11432 • Published Aug 22, 2023 • 3
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG Paper • 2503.04388 • Published 26 days ago • 15
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models Paper • 2502.08130 • Published Feb 12 • 9
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models Paper • 2409.04787 • Published Sep 7, 2024 • 1
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data Paper • 2502.08468 • Published Feb 12 • 13
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Paper • 2502.09560 • Published Feb 13 • 35
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights Paper • 2502.09619 • Published Feb 13 • 33
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 95
JuStRank: Benchmarking LLM Judges for System Ranking Paper • 2412.09569 • Published Dec 12, 2024 • 20
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI Paper • 2401.14019 • Published Jan 25, 2024 • 23
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published May 2, 2024 • 56