SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published 19 days ago • 81
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 951
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 37
Parallel Sentences Datasets Collection These datasets all have "english" and "non_english" columns. They can be used to make embedding models multilingual. • 14 items • Updated Feb 25 • 15
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 733