Srulik ben David

Srulikbdd
AI & ML interests

None yet

Recent Activity

Organizations

Hugging Face Discord Community

Srulikbdd's activity

replied to kz919's post 6 months ago

Thanks!!
Awesome.
Did you check whether he chooses a lot of random moves?

replied to kz919's post 6 months ago

He is really bad, but it's cool!
What's the code behind it?

reacted to kz919's post with 🔥🧠 6 months ago
reacted to lamhieu's post with 🔥 9 months ago
Wow, this is amazing! 🤯
Samba is a powerful hybrid model with an unlimited context length, built by stacking Mamba, MLP, Sliding Window Attention, and MLP layers. The largest version, Samba-3.8B, was trained on 3.2 trillion tokens; it excels in benchmarks like MMLU, GSM8K, and HumanEval, and shines in long-context tasks with minimal tuning.
---
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
GitHub: https://github.com/microsoft/Samba
reacted to ordagan's post with 🔥🚀 12 months ago
Excited to introduce Jamba by AI21
ai21labs/Jamba-v0.1

We are thrilled to announce Jamba, the world's first production-grade Mamba-based model.

Key Features:
- First production-grade Mamba-based model built on a novel SSM-Transformer hybrid architecture
- 3X throughput on long contexts compared to Mixtral 8x7B
- Democratizes access to a massive 256K context window
- The only model in its size class that fits up to 140K context on a single GPU

Jamba is based on a novel architecture that combines Mamba and Transformer. While our initial results show great efficiency gains, we expect this to be further explored and improved with the help of the community.

Check out our blog post for more info: https://ai21-labs.webflow.io/blog/announcing-jamba