LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks • Paper • 2412.15204 • Published 6 days ago • 31
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces • Paper • 2412.14171 • Published 7 days ago • 22
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks • Paper • 2412.14161 • Published 7 days ago • 43
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published 13 days ago • 75
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities • Paper • 2412.06745 • Published 16 days ago • 6
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations • Paper • 2412.08580 • Published 14 days ago • 44
Learning Flow Fields in Attention for Controllable Person Image Generation • Paper • 2412.08486 • Published 14 days ago • 32
Hidden in the Noise: Two-Stage Robust Watermarking for Images • Paper • 2412.04653 • Published 20 days ago • 28
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation • Paper • 2412.06781 • Published 16 days ago • 18
Evaluating Language Models as Synthetic Data Generators • Paper • 2412.03679 • Published 21 days ago • 43
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published 21 days ago • 118
MALT: Improving Reasoning with Multi-Agent LLM Training • Paper • 2412.01928 • Published 23 days ago • 39
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models • Paper • 2411.18613 • Published 28 days ago • 50