2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 91
ProgCo: Program Helps Self-Correction of Large Language Models
Paper • 2501.01264 • Published • 24
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Paper • 2501.01957 • Published • 33
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Paper • 2501.03226 • Published • 33
sergicalsix
AI & ML interests
None yet
Recent Activity
upvoted a paper about 8 hours ago
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model
upvoted a paper about 8 hours ago
Building Foundations for Natural Language Processing of Historical Turkish: Resources and Models
Organizations
Collections: 1
Spaces: 1
Models: None public yet