arxiv:2412.02611
Shijia Yang
shijiay
AI & ML interests
None yet
Recent Activity
upvoted a paper 21 days ago
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
authored a paper 22 days ago
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
Organizations
None yet
models 27
shijiay/llava_clip224_stage1 • Image-Text-to-Text • Updated • 7
shijiay/llava_clip224_stage2 • Image-Text-to-Text • Updated • 13
shijiay/llava_dinov2_stage2 • Image-Text-to-Text • Updated • 11 • 1
shijiay/llava_clip_stage1 • Image-Text-to-Text • Updated • 8
shijiay/llava_clip_stage2 • Image-Text-to-Text • Updated • 17
shijiay/llava_openclip_stage1 • Image-Text-to-Text • Updated • 7
shijiay/llava_openclip_stage2 • Image-Text-to-Text • Updated • 13
shijiay/llava_siglip_stage1 • Image-Text-to-Text • Updated • 7
shijiay/llava_siglip_stage2 • Image-Text-to-Text • Updated • 13
shijiay/llava_sdim_stage1 • Image-Text-to-Text • Updated • 5
datasets
None public yet