Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives • Paper • 2501.04003 • Published 5 days ago • 20
An Empirical Study of Autoregressive Pre-training from Videos • Paper • 2501.05453 • Published 3 days ago • 28
The GAN is dead; long live the GAN! A Modern GAN Baseline • Paper • 2501.05441 • Published 3 days ago • 52
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models • Paper • 2501.02955 • Published 6 days ago • 39
MLLM-as-a-Judge for Image Safety without Human Labeling • Paper • 2501.00192 • Published 13 days ago • 23
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining • Paper • 2501.00958 • Published 11 days ago • 91
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization • Paper • 2412.18525 • Published 19 days ago • 65
MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes • Paper • 2412.19260 • Published 17 days ago • 1
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs • Paper • 2412.18925 • Published 18 days ago • 89
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models • Paper • 2412.18609 • Published 19 days ago • 15
Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation • Paper • 2412.18176 • Published 19 days ago • 15
Revisiting In-Context Learning with Long Context Language Models • Paper • 2412.16926 • Published 21 days ago • 28
No More Adam: Learning Rate Scaling at Initialization is All You Need • Paper • 2412.11768 • Published 27 days ago • 41
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published 25 days ago • 121
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published about 1 month ago • 85
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition • Paper • 2412.09501 • Published about 1 month ago • 44