SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion Paper • 2412.10437 • Published Dec 11, 2024 • 3
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published about 1 month ago • 92
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Paper • 2411.14432 • Published Nov 21, 2024 • 22
EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations Paper • 2410.10315 • Published Oct 14, 2024 • 2
IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts Paper • 2310.05375 • Published Oct 9, 2023 • 3
Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal Paper • 2404.17808 • Published Apr 27, 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts Paper • 2407.09816 • Published Jul 13, 2024 • 1
Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Paper • 2307.09323 • Published Jul 18, 2023
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting Paper • 2404.15264 • Published Apr 23, 2024
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024 • 76
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 51
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 62