Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14 • 535
mHuBERT-147 models Collection Compact yet powerful multilingual speech representation models based on the HuBERT architecture. • 3 items • Updated Jun 4 • 6
LLaVa-NeXT Collection LLaVa-NeXT (also known as LLaVa-1.6) improves upon the 1.5 series by incorporating higher image resolutions and more reasoning/OCR datasets. • 8 items • Updated Jul 19 • 27
Benchmarking Representations for Speech, Music, and Acoustic Events Paper • 2405.00934 • Published May 2 • 1
ARCH models, benchmark and paper Collection This collection contains pre-trained models on the AudioSet dataset, offering a diverse set of features for audio representation learning. • 6 items • Updated Jun 21 • 1
XLSR Collection A collection of multilingual Wav2Vec 2.0 checkpoints pre-trained on 53 languages and fine-tuned for CTC speech recognition. • 12 items • Updated Jan 16 • 6
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 62
🇮🇹 Italian NLP Resources Collection Collection of models, datasets and demos relevant to Italian NLP 🇮🇹 • 268 items • Updated 5 days ago • 24
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 84
AudioPaLM: A Large Language Model That Can Speak and Listen Paper • 2306.12925 • Published Jun 22, 2023 • 53