Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks Paper • 2501.08326 • Published 26 days ago • 31
Article Halo: Open Source Health Tracking with Wearables By cyrilzakka • Nov 19, 2024 • 105
Running 18 Hugging Face Values 🤗 Empower users to use machine learning through an open collaboration platform
Article Design choices for Vision Language Models in 2024 By gigant • Apr 16, 2024 • 26
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • Jun 23, 2024 • 70
Running 540 Vision Arena (Testing VLMs side-by-side) 🖼 Analyze images to detect and label objects