LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 113
VideoAgent: Long-form Video Understanding with Large Language Model as Agent Paper • 2403.10517 • Published Mar 15, 2024 • 33
Post: Multi-Instance Generation Controller: Enjoy complete control over position generation, attribute determination, and count! Code link: https://github.com/limuloo/MIGC Project page: https://migcproject.github.io/ MIGC decouples multi-instance generation into individual single-instance generation subtasks within the cross-attention layer of Stable Diffusion. Welcome to follow our project and use the code to create anything you imagine! Please let us know if you have any suggestions!
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting Paper • 2402.06149 • Published Feb 9, 2024 • 18