ReferEverything: Towards Segmenting Everything We Can Speak of in Videos • arXiv:2410.23287 • Published Oct 30, 2024
Learning Video Representations without Natural Videos • arXiv:2410.24213 • Published Oct 31, 2024
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey • arXiv:2409.11564 • Published Sep 17, 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning • arXiv:2409.12183 • Published Sep 18, 2024
A Controlled Study on Long Context Extension and Generalization in LLMs • arXiv:2409.12181 • Published Sep 18, 2024
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution • arXiv:2409.12191 • Published Sep 18, 2024
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization • arXiv:2409.12903 • Published Sep 19, 2024
Training Language Models to Self-Correct via Reinforcement Learning • arXiv:2409.12917 • Published Sep 19, 2024
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines • arXiv:2409.12959 • Published Sep 19, 2024
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • arXiv:2409.12961 • Published Sep 19, 2024
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale • arXiv:2409.17115 • Published Sep 25, 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models • arXiv:2409.17146 • Published Sep 25, 2024
To Code, or Not To Code? Exploring Impact of Code in Pre-training • arXiv:2408.10914 • Published Aug 20, 2024
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models • arXiv:2408.11318 • Published Aug 21, 2024
Controllable Text Generation for Large Language Models: A Survey • arXiv:2408.12599 • Published Aug 22, 2024