FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces Paper • 2501.12909 • Published 13 days ago • 62
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published 15 days ago • 27
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published Jan 2 • 13
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published 17 days ago • 23
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published 14 days ago • 48
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published 18 days ago • 42
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 21 days ago • 272
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published Dec 23, 2024 • 43
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published Dec 23, 2024 • 21
Agent-SafetyBench: Evaluating the Safety of LLM Agents Paper • 2412.14470 • Published Dec 19, 2024 • 12
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios Paper • 2412.08972 • Published Dec 12, 2024 • 10
VisionArena: 230K Real World User-VLM Conversations with Preference Labels Paper • 2412.08687 • Published Dec 11, 2024 • 13
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials Paper • 2412.09605 • Published Dec 12, 2024 • 28
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139
Finance Commons Collection A large collection of multimodal financial documents in open data. • 7 items • Updated Jul 17, 2024 • 8
Releasing the largest multilingual open pretraining dataset Article • By Pclanglais • Nov 13, 2024 • 98
YiZhao Dataset Collection Data and filtering models of our financial open-source YiZhao Dataset. • 5 items • Updated 26 days ago • 1
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment Paper • 2411.17188 • Published Nov 26, 2024 • 21