Chew Kok Wah

chewkokwah

AI & ML interests

Open Domain Question Answering

chewkokwah's activity

liked a Space 6 days ago

SaylorTwift/OpenEvalsDetails

Show model details for specific benchmarks

upvoted a paper 7 days ago

MegaMath: Pushing the Limits of Open Math Corpora

Paper • 2504.02807 • Published 12 days ago • 29

upvoted an article 13 days ago

Visualize and understand GPU memory in PyTorch

Article • Dec 24, 2024 • 213

liked 2 models 14 days ago

qihoo360/Light-R1-32B-DS

Text Generation • Updated 30 days ago • 1.19k • 14

qihoo360/Light-R1-14B-DS

Text Generation • Updated 30 days ago • 2.92k • 35

liked 2 models 16 days ago

casperhansen/deepseek-r1-distill-qwen-7b-awq

Updated Feb 8 • 6.42k • 9

casperhansen/deepseek-r1-distill-qwen-14b-awq

Updated Feb 8 • 5.34k • 13

liked a model 18 days ago

kaitchup/DeepSeek-R1-Distill-Qwen-14B-AutoRound-GPTQ-4bit

Text Generation • Updated Jan 27 • 264 • 6

upvoted 2 collections about 1 month ago

Light-R1

Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated Mar 13 • 11

TinyR1

1 item • Updated Mar 5 • 3

liked a model about 2 months ago

qihoo360/TinyR1-32B-Preview

Text Generation • Updated Mar 10 • 3.44k • 327

upvoted a collection about 2 months ago

DeepSeek-R1-Distill Quantized

18 items • Updated Feb 7 • 16

upvoted a paper about 2 months ago

SIFT: Grounding LLM Reasoning in Contexts via Stickers

Paper • 2502.14922 • Published Feb 19 • 31

upvoted a paper 2 months ago

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 49

liked a dataset 2 months ago

Anthropic/EconomicIndex

Viewer • Updated 19 days ago • 3.36k • 3.89k • 259

New activity in open-r1/README 2 months ago

[Experiment] Applying GRPO to DeepSeek-R1-Distill-Qwen-1.5B with LIMO

#15 opened 2 months ago by

New activity in NovaSky-AI/Sky-T1-32B-Flash 3 months ago

License of Your Model

#4 opened 3 months ago by

upvoted a collection 3 months ago

FuseO1-Preview

System-II Reasoning Fusion of LLMs • 11 items • Updated 7 days ago • 22