Submitted by hadasor 42 LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations · 7 authors 2
Submitted by akhaliq 26 VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide · 4 authors 3
Submitted by ysu-nlp 18 ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery · 20 authors 2
Submitted by ysu-nlp 16 Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents · 8 authors 2
Submitted by Njb 15 Presto! Distilling Steps and Layers for Accelerating Music Generation · 6 authors 4
Submitted by deqing 14 TLDR: Token-Level Detective Reward Model for Large Vision Language Models · 8 authors 2
Submitted by demolei 13 MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs · 9 authors 2
Submitted by Junyi42 13 MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion · 8 authors 3
Submitted by mfarajtabar 12 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models · 6 authors 2
Submitted by qq8933 11 LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning · 12 authors 2
Submitted by lilelife 8 OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction · 9 authors 2
Submitted by penfever 7 SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification · 6 authors 2
Submitted by Duguce 7 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles · 8 authors 2
Submitted by thuhsy 6 Autonomous Character-Scene Interaction Synthesis from Text Instruction · 7 authors 2
Submitted by RaphaelLiu 4 Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach · 8 authors 2
Submitted by DwanZhang 4 SePPO: Semi-Policy Preference Optimization for Diffusion Alignment · 11 authors 2
Submitted by ZinengTang 3 Grounding Language in Multi-Perspective Referential Communication · 3 authors 2
Submitted by zheweiyao 1 SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation · 4 authors 2