Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

liked a model about 12 hours ago

sand-ai/MAGI-1

updated a model about 14 hours ago

KingNish/Smollm-135M-audio

liked a Space about 16 hours ago

nanotron/ultrascale-playbook

KingNish's activity

upvoted an article 3 days ago

17 Reasons Why Gradio Isn't Just Another UI Library

Article • 7 days ago • 17

upvoted a paper 7 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 9 days ago • 239

upvoted a collection 12 days ago

indic-evals

Translated versions of popular LLM benchmarks. • 4 items • Updated Oct 23, 2024 • 5

upvoted a collection 13 days ago

Orpheus Multilingual Research Release

Beta Release of multilingual models. • 12 items • Updated 12 days ago • 76

upvoted a paper 14 days ago

Rethinking Reflection in Pre-Training

Paper • 2504.04022 • Published 18 days ago • 76

upvoted a paper 26 days ago

Gemma 3 Technical Report

Paper • 2503.19786 • Published 29 days ago • 47

upvoted a collection 27 days ago

SANA-Sprint

🏃SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated 6 days ago • 35

upvoted 2 collections about 1 month ago

SuperBPE

SuperBPE tokenizers and models trained with them • 8 items • Updated 13 days ago • 14

Orpheus TTS

TTS Towards Human-Sounding Speech • 2 items • Updated Mar 18 • 60

upvoted a paper about 1 month ago

KBLaM: Knowledge Base augmented Language Model

Paper • 2410.10450 • Published Oct 14, 2024 • 2

upvoted a collection about 1 month ago

SANA-1.5

SANA-1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • 6 items • Updated 6 days ago • 4

upvoted 2 papers about 1 month ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 226

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Paper • 2503.10460 • Published Mar 13 • 28

upvoted 2 collections about 1 month ago

Gemma 3 Release

24 items • Updated 5 days ago • 341

EuroBERT

Scaling Multilingual Encoders for European Languages • 4 items • Updated Mar 10 • 11

upvoted a paper about 2 months ago

Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing

Paper • 2502.14458 • Published Feb 20 • 2

upvoted an article about 2 months ago

Fine-Tune Whisper with 🤗 Transformers

Article • Nov 3, 2022 • 217

upvoted a paper about 2 months ago

SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Paper • 2502.20545 • Published Feb 27 • 22