Forgetting Transformer: Softmax Attention with a Forget Gate Paper • 2503.02130 • Published 9 days ago • 26
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 6 items • Updated about 19 hours ago • 9
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 9 days ago • 64
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 9 days ago • 62
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated 6 days ago • 13
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated 10 days ago • 109
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 14 days ago • 92
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 24 days ago • 55
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 23 days ago • 93
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 29 days ago • 93
TransPixar: Advancing Text-to-Video Generation with Transparency Paper • 2501.03006 • Published Jan 6 • 23
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published Dec 6, 2024 • 51