Submitted by LXT 51 OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding · 8 authors 6
Submitted by xinlai 40 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs · 6 authors 2
Submitted by multimodalart 33 MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data · 2 authors 3
Submitted by TranSirius 29 SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation · 8 authors 1
Submitted by TranSirius 24 Aligning Teacher with Student Preferences for Tailored Training Data Generation · 6 authors 2
Submitted by davanstrien 18 LiveBench: A Challenging, Contamination-Free LLM Benchmark · 15 authors 2
Submitted by Foxfi 13 MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression · 13 authors 4
Submitted by akhaliq 11 AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models · 12 authors 4
Submitted by xw-eric 9 Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding · 9 authors 2
Submitted by mbrack 8 T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings · 5 authors 5
Submitted by dongguanting 5 Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation · 6 authors 5
Submitted by ahmedheakl 5 ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs · 5 authors 5
Submitted by ahmedheakl 3 ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models · 5 authors 3