Model and data for ReflectiVA: Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering [CVPR 2025]
Federico Cocchi
fede97
AI & ML interests: Multimodal LLM, Computer Vision
Recent Activity
upvoted a paper 2 days ago: Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
liked a Space 4 days ago: nanotron/ultrascale-playbook
updated a collection 4 days ago: ReflectiVA
Organizations
Collections: 5
Models: None public yet
Datasets: 5
fede97/external_test_set_v1 • Viewer • Updated • 340 • 52
fede97/external_data_test_example_v3 • Updated • 6
fede97/external_data_test_example • Viewer • Updated • 410 • 94
fede97/external_data_test_example_v2 • Viewer • Updated • 410 • 111
fede97/dpo_demo • Viewer • Updated • 148k • 92