Shengqiong Wu

ChocoWu

https://chocowu.github.io/

AI & ML interests

Large Language Model, Multimodal Learning, Scene Graph Generation

ChocoWu's activity

updated a dataset about 10 hours ago

General-Level/General-Bench-Openset

Viewer • Updated about 10 hours ago • 200

updated a dataset about 11 hours ago

General-Level/General-Bench-Closeset

Updated about 11 hours ago • 36

upvoted a paper 2 days ago

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Paper • 2503.23377 • Published 8 days ago • 42

authored a paper 5 days ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published 6 days ago • 68

commented on a paper 3 times 5 days ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published 6 days ago • 68

upvoted a paper 5 days ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published 6 days ago • 68

updated a model 9 days ago

ChocoWu/test-model

Updated 9 days ago

upvoted a paper 13 days ago

Position: Interactive Generative Video as Next-Generation Game Engine

Paper • 2503.17359 • Published 16 days ago • 61

authored a paper 20 days ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 21 days ago • 32

upvoted a paper 20 days ago

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published 21 days ago • 32

published a model 3 months ago

ChocoWu/test-model

Updated 9 days ago

authored a paper 3 months ago

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Paper • 2412.19806 • Published Oct 8, 2024 • 1

New activity in Bin1117/AnyEdit 4 months ago

wrong format of data

#2 opened 4 months ago by ChocoWu

upvoted a paper 9 months ago

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Paper • 2406.19389 • Published Jun 27, 2024 • 54

authored a paper 9 months ago

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Paper • 2406.19389 • Published Jun 27, 2024 • 54

commented on a paper about 1 year ago

NExT-GPT: Any-to-Any Multimodal LLM

Paper • 2309.05519 • Published Sep 11, 2023 • 78

New activity in lmsys/vicuna-7b-v1.5 over 1 year ago

Garbled characters from Vicuna 7b-v1.5

#10 opened over 1 year ago by ChocoWu

New activity in ChocoWu/nextgpt_7b_tiva_v0 over 1 year ago

Not able to load the model using transformers

#1 opened over 1 year ago by Prajwal231