Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published 2 days ago • 10
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 2 days ago • 196
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published 16 days ago • 241
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation Paper • 2503.13358 • Published about 1 month ago • 95
Cohere Labs Aya Vision Collection • Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated about 23 hours ago • 68
A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality Article • Published Mar 4 • 73
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7 • 117
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 93