VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published 10 days ago • 42
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published 11 days ago • 29
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 9 days ago • 119
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published 14 days ago • 116
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors Paper • 2504.11427 • Published 5 days ago • 16
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models Paper • 2504.11468 • Published 10 days ago • 25
Cobra: Efficient Line Art COlorization with BRoAder References Paper • 2504.12240 • Published 4 days ago • 25
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness Paper • 2504.10514 • Published 10 days ago • 45
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published 5 days ago • 54
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published 5 days ago • 13
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 101
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 18 days ago • 33
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published 13 days ago • 59
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published 13 days ago • 79
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 12 days ago • 101
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 12 days ago • 145