🚀🚀🚀 Introducing Insight-V! An early attempt at o1-like multi-modal reasoning.
We offer a structured long-chain visual reasoning data generation pipeline and a multi-agent system to unleash the reasoning potential of MLLMs.
📜 Paper: https://arxiv.org/abs/2411.14432
🛠️ Github: https://github.com/dongyh20/Insight-V
💼 Model Weights: THUdyh/insight-v-673f5e1dd8ab5f2d8d332035