Shuhuai Ren

ShuhuaiRen

https://renshuhuai-andy.github.io/

AI & ML interests

NLP, Multi-modal

upvoted a collection 3 months ago

MiMo-Audio

5 items • Updated 15 days ago • 25

upvoted a paper 4 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 84

upvoted a collection 5 months ago

MiMo-VL

6 items • Updated 15 days ago • 38

upvoted a paper 7 months ago

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4, 2025 • 80

upvoted a paper 8 months ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12, 2025 • 82

upvoted a paper 9 months ago

Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Paper • 2503.16430 • Published Mar 20, 2025 • 34

upvoted 2 papers 11 months ago

Next Block Prediction: Video Generation via Semi-Autoregressive Modeling

Paper • 2502.07737 • Published Feb 11, 2025 • 9

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Paper • 2502.06788 • Published Feb 10, 2025 • 13

upvoted a paper about 1 year ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 53

upvoted 2 papers over 1 year ago

M^3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

Paper • 2306.04387 • Published Jun 7, 2023 • 8

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Paper • 2405.21075 • Published May 31, 2024 • 26

upvoted a paper almost 2 years ago

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Paper • 2312.02051 • Published Dec 4, 2023 • 1