Article 🦸🏻#7: From Agentic AI to Physical AI By Kseniase • about 14 hours ago • 3
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 8 days ago • 72
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 4 days ago • 67
TACO Models Collection This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated 22 days ago • 8
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 16 days ago • 78
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 5 days ago • 54
My Loras Collection loras i've made that are hosted here • 11 items • Updated about 9 hours ago • 5
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 4 days ago • 184
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 25 days ago • 121
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 24 days ago • 122
The Open Source Advantage in Large Language Models (LLMs) Paper • 2412.12004 • Published 27 days ago • 9
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published about 1 month ago • 35
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 30 days ago • 136
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 52
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published about 1 month ago • 92
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published Dec 11, 2024 • 38