Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 27 days ago • 96
Constructing and Expanding Low-Resource and Underrepresented Parallel Datasets for Indonesian Local Languages Paper • 2404.01009 • Published Apr 1, 2024
Impact of Multilingual Alignment - Alignment Dataset Collection A collection of word-level alignments, restructured to make the analysis section easier. • 5 items • Updated about 1 month ago
Lius - Translation Models Collection Collection An effort to build LLM-based translation models for the Malay Kupang language. • 13 items • Updated Jan 26 • 1