Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Article • Apr 15, 2024 • 171
ReVideo: Remake a Video with Motion and Content Control Paper • 2405.13865 • Published May 22, 2024 • 23
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 127
LITA: Language Instructed Temporal-Localization Assistant Paper • 2403.19046 • Published Mar 27, 2024 • 18
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation Paper • 2312.12491 • Published Dec 19, 2023 • 69