BuiDoan's Collections
Great paper
Paper
•
2410.05258
•
Published
•
169
PaliGemma 2: A Family of Versatile VLMs for Transfer
Paper
•
2412.03555
•
Published
•
121
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper
•
2412.04467
•
Published
•
105
o1-Coder: an o1 Replication for Coding
Paper
•
2412.00154
•
Published
•
42
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance
Paper
•
2412.02687
•
Published
•
108
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
Paper
•
2411.18671
•
Published
•
20
Fully Open Source Moxin-7B Technical Report
Paper
•
2412.06845
•
Published
•
10
Small Language Models: Survey, Measurements, and Insights
Paper
•
2409.15790
•
Published
•
1
Paper
•
2407.10671
•
Published
•
160
Paper
•
2412.08905
•
Published
•
100
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper
•
2412.10360
•
Published
•
136
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper
•
2412.09871
•
Published
•
85
Paper
•
2412.15115
•
Published
•
339