TabiBench (Collection): Tabi Benchmark for Language Evaluation. This benchmark includes 28 Turkish fine-tuning datasets. Codebase: https://github.com/boun-tabi-LMG/TabiBERT • 28 items • Updated 13 days ago
mmBERT: a modern multilingual encoder (Collection): mmBERT is trained on 3T tokens from over 1800 languages, showing state-of-the-art scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9, 2025