Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 354
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 555
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 90
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 701
Datasets for Pretrained Thai LLM Collection List Datasets for pretrained Thai LLM by PyThaiNLP • 23 items • Updated Sep 12, 2024 • 9
BiPhone: Modeling Inter Language Phonetic Influences in Text Paper • 2307.03322 • Published Jul 6, 2023 • 7