Joseph (Joseph717171)

AI & ML interests

None yet

Organizations

Hugging Face Discord Community

Joseph717171's activity

reacted to merterbak's post with 🤗🔥 4 days ago:

Meta has unveiled its Llama 4 🦙 family of models, featuring native multimodality and a mixture-of-experts architecture. Two of the models are available now:
Model collection 🤗: meta-llama/llama-4-67f0c30d9fe03840bc9d0164
Blog Post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
HF's Blog Post: https://huggingface.co/blog/llama4-release

- 🧠 Native Multimodality - Process text and images in a unified architecture
- 🔍 Mixture-of-Experts - The first Llama models to use MoE, activating only a few experts per token for efficiency (see the routing sketch after this list)
- 📏 Super Long Context - Up to 10M tokens
- 🌐 Multilingual Power - Trained on 200 languages, with 10x more multilingual tokens than Llama 3 (including over 100 languages with over 1 billion tokens each)
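
As a rough intuition for the "active vs. total parameters" numbers below, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. The layer sizes, expert count, and top_k are hypothetical placeholders, not Llama 4's actual configuration: a router scores each token, only the top-scoring experts run, so most parameters stay idle per token.

```python
# Illustrative top-k mixture-of-experts layer (hypothetical sizes, not Llama 4's).
# Each token is routed to its top_k highest-scoring experts; the rest stay idle,
# which is why "active parameters" per token are far fewer than total parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, top_k=1):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)        # routing probabilities
        top_w, top_idx = probs.topk(self.top_k, dim=-1)  # pick k experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e                # tokens sent to expert e
                if mask.any():
                    out[mask] += top_w[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(8, 512)    # 8 tokens
print(TopKMoE()(x).shape)  # torch.Size([8, 512])
```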

🔹 Llama 4 Scout
- 17B active parameters (109B total)
- 16-expert architecture
- 10M context window
- Fits on a single H100 GPU (with Int4 quantization)
- Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1

🔹 Llama 4 Maverick
- 17B active parameters (400B total)
- 128-expert architecture
- Fits on a single DGX H100 host (8× H100)
- 1M context window
- Outperforms GPT-4o and Gemini 2.0 Flash
- Elo score of 1417 on LMArena, currently the second-best model on the arena

🔹 Llama 4 Behemoth (Coming Soon)
- 288B active parameters (2T total)
- 16-expert architecture
- Teacher model for Scout and Maverick
- Outperforms GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on STEM benchmarks
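
For anyone wanting to try the released checkpoints, here is a minimal sketch of text generation through the transformers pipeline. The repo id below is an assumption based on the collection linked above (check the collection page for the exact names), and the weights are gated, so you need to accept Meta's license on the Hub and be logged in first.

```python
# Minimal sketch: text generation with a Llama 4 checkpoint via transformers.
# The repo id is assumed from the collection above; verify it on the Hub.
# Requires: transformers, accelerate, and a logged-in Hub session with
# access to the gated meta-llama weights.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed repo id
    device_map="auto",   # shard layers across available GPUs
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
out = pipe(messages, max_new_tokens=64)
print(out[0]["generated_text"])
```
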
New activity in huggingchat/chat-ui 5 days ago

[MODELS] Discussion

#372 opened about 1 year ago by victor