Shubham Toshniwal

stoshniwal

https://shtoshni.github.io/

shtoshni

AI & ML interests

NLP, LLM

stoshniwal's activity

liked a dataset 13 days ago

nvidia/OpenCodeReasoning

Viewer • Updated 5 days ago • 753k • 10.2k • 257

liked a dataset 17 days ago

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated 4 days ago • 3.91M • 6.21k • 413

New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 3 months ago

Tokenizer config is wrong

#10 opened 3 months ago by stoshniwal

liked 2 models 5 months ago

Qwen/Qwen2.5-Math-7B-Instruct

Text Generation • Updated Sep 23, 2024 • 57k • 71

Qwen/QwQ-32B-Preview

Text Generation • Updated Jan 12 • 227k • 1.73k

upvoted a paper 5 months ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 55

updated 4 models 5 months ago

updated a dataset 5 months ago

nvidia/OpenMathInstruct-2

Viewer • Updated Nov 25, 2024 • 22M • 8.34k • 167

upvoted a collection 5 months ago

Qwen2.5-Math

Collection

Math-specific model series based on Qwen2.5 • 11 items • Updated Jan 14 • 80

liked a model 5 months ago

nvidia/Cosmos-0.1-Tokenizer-DV4x8x8

Updated Nov 11, 2024 • 600 • 12

upvoted an article 6 months ago

Fixing Gradient Accumulation

Article • Published Oct 16, 2024 • 53

upvoted a collection 6 months ago

Llama-3.1-Nemotron-70B

Collection

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 6 days ago • 155

New activity in nvidia/OpenMathInstruct-2 6 months ago

Upload scaling_plot.jpg

#4 opened 6 months ago by shtoshni

New activity in nvidia/OpenMathInstruct-2 7 months ago

Unable to load dataset

#3 opened 7 months ago by minyichen

Dataset Viewer issue: JobManagerCrashedError

#2 opened 7 months ago by stoshniwal

liked a model 7 months ago

nvidia/NVLM-D-72B

Image-Text-to-Text • Updated Jan 14 • 14.3k • 768

upvoted a collection 7 months ago

OpenMath-2

Collection

A collection of models and datasets introduced in "OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data" • 7 items • Updated 6 days ago • 15