T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published 9 days ago • 38
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published about 1 month ago • 33
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models Paper • 2503.09669 • Published Mar 12 • 35
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates Paper • 2503.07216 • Published Mar 10 • 31
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published Mar 7 • 44
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation Paper • 2502.08826 • Published Feb 12 • 17
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model Paper • 2502.13449 • Published Feb 19 • 45
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models Paper • 2502.12464 • Published Feb 18 • 27
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published Feb 13 • 148
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 91
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 381
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published Jan 10 • 72
Revisiting In-Context Learning with Long Context Language Models Paper • 2412.16926 • Published Dec 22, 2024 • 33
VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding Paper • 2412.02186 • Published Dec 3, 2024 • 22