Train 400x faster Static Embedding Models with Sentence Transformers Article • 11 days ago • 121
Welcome FalconMamba: The first strong attention-free 7B model Article • Aug 12, 2024 • 108
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published Jul 1, 2024 • 40
Wavelets Are All You Need for Autoregressive Image Generation Paper • 2406.19997 • Published Jun 28, 2024 • 30
Preference Tuning LLMs with Direct Preference Optimization Methods Article • Jan 18, 2024 • 43
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 50