
Yansong Shi

nanamma
https://huggingface.co/nanamma

AI & ML interests

multi modality, video understanding, robotics

Organizations

OpenGVLab

authored a paper 2 months ago

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Paper • 2603.03985 • Published Mar 4 • 6
submitted a paper to Daily Papers 2 months ago

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Paper • 2603.03985 • Published Mar 4 • 6
authored 2 papers 8 months ago

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Paper • 2403.15377 • Published Mar 22, 2024 • 29

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

Paper • 2410.19702 • Published Oct 25, 2024 • 1