Sarthak Thakur

sarthak247

AI & ML interests

None yet

Organizations

sarthak247's activity

liked a model 1 day ago

HiDream-ai/HiDream-I1-Dev

Text-to-Image • Updated 2 days ago • 4.67k • 82

upvoted 6 papers 1 day ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published 6 days ago • 66

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 7 days ago • 97

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published 8 days ago • 92

C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing

Paper • 2504.07964 • Published 5 days ago • 58

Kimi-VL Technical Report

Paper • 2504.07491 • Published 5 days ago • 108

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 14 days ago • 72

upvoted a paper 2 days ago

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Paper • 2411.17525 • Published Nov 26, 2024 • 3

liked 5 models 4 days ago

Skywork/SkyReels-A2

Updated 7 days ago • 544 • 110

ByteDance/MegaTTS3

Text-to-Speech • Updated 11 days ago • 2.94k • 343

moonshotai/Kimi-VL-A3B-Thinking

Image-Text-to-Text • Updated 1 day ago • 10.2k • 303

HiDream-ai/HiDream-I1-Full

Text-to-Image • Updated 2 days ago • 16k • 437

rasbt/llama-3.2-from-scratch

Updated 14 days ago • 252

updated a collection 8 days ago

Gemma-3-1B-GRPO

Gemma 3 (1B) model with GRPO training • 2 items • Updated 8 days ago

updated a model 8 days ago

sarthak247/gemma-3-1B-GRPO-float16

Text Generation • Updated 8 days ago • 1

published a model 8 days ago

sarthak247/gemma-3-1B-GRPO-float16

Text Generation • Updated 8 days ago • 1

updated a model 8 days ago

sarthak247/gemma-3-1B-GRPO-Adapter

Updated 8 days ago

published a model 8 days ago

sarthak247/gemma-3-1B-GRPO-Adapter

Updated 8 days ago

liked a model 9 days ago

meta-llama/Llama-4-Scout-17B-16E-Instruct

Image-Text-to-Text • Updated 6 days ago • 657k • 777