Shahrukh Khan's picture

Shahrukh Khan

shahrukhx01

·

https://github.com/shahrukhx01

AI & ML interests

NLP

Recent Activity

liked a model about 11 hours ago

mistralai/Mistral-Small-3.1-24B-Instruct-2503

upvoted a collection about 19 hours ago

liked a model about 19 hours ago

Zyphra/Zonos-v0.1-transformer

View all activity

Organizations

shahrukhx01's activity

upvoted a collection about 19 hours ago

Zonos-v0.1

3 items • Updated Feb 12 • 25

upvoted 2 collections 4 days ago

Ultravox v0.5

Ultravox is a multimodal Speech LLM built around different pretrained LLMs (frozen) and the whisper-large-v3-turbo (fine-tuned) backbone. • 3 items • Updated Feb 10 • 9

reranking series v2

V2 crispy rerank series • 2 items • Updated 5 days ago • 19

upvoted 2 collections 5 days ago

BD3-LMs

https://m-arriola.com/bd3lms/ • 4 items • Updated 4 days ago • 15

Gemma 3 Release

9 items • Updated 4 days ago • 263

upvoted a collection 13 days ago

Hallucination detection

Trained ModernBERT (base and large) for detection hallucinations in LLM responses. The models are trained as token classifications. • 4 items • Updated 13 days ago • 15

upvoted a collection 24 days ago

GemmaX2

GemmaX2 language models, including pretrained and instruction-tuned models of 2 sizes, including 2B, 9B. • 7 items • Updated Feb 7 • 21

upvoted 2 collections about 2 months ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 20 days ago • 107

DeepSeek-R1

8 items • Updated Jan 21 • 579

upvoted a collection 2 months ago

KaLM-embedding

11 items • Updated 7 days ago • 23

upvoted 2 collections 3 months ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Jan 8 • 565

Common Models

The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 30

upvoted 2 collections 4 months ago

UltraVox Audio Language Model Release 🔊

3 items • Updated Nov 15, 2024 • 15

Marqo-Ecommerce-Embeddings

State-of-the-art embedding models fine-tuned for the ecommerce domain. +67% increase in evaluation metrics vs ViT-B-16-SigLIP. • 10 items • Updated Nov 14, 2024 • 17

upvoted 6 collections 5 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 26 days ago • 248

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 111

CompassJudger

4 items • Updated Oct 16, 2024 • 8

glm-4-voice

3 items • Updated Oct 25, 2024 • 3

Granite 3.0 Language Models

A series of language models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated 21 days ago • 96

LayerSkip

Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated Nov 21, 2024 • 47