Dmitry Abulkhanov

mponty

AI & ML interests

None yet

Recent Activity

updated a collection 4 days ago

Agentic rollouts

Organizations

updated a collection 4 days ago

Agentic rollouts

Collection

6 items • Updated 4 days ago

upvoted a collection 4 days ago

Nemotron Agentic & Tool-Use

Collection

Datasets for building models capable of function calling, multi-step agentic tasks, terminal use, and SWE workflows. • 9 items • Updated 5 days ago • 7

liked a dataset 4 days ago

yoonholee/terminalbench-trajectories

Viewer • Updated Mar 9 • 52.1k • 640 • 7

upvoted an article 2 months ago

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 154

published 2 models 3 months ago

spec-diffusion/DEBUG-QWEN-05B-lr9e5-ep20-bs8-w16-pc005-GSM8K-76ge2r

Updated Mar 19, 2025

spec-diffusion/deepseek-ai-DeepSeek-R1-Distill-Qwen-7B-lr9e5-ep20-bs8-w16-pc005-GSM8K

Updated Mar 19, 2025

upvoted 2 articles 5 months ago

Article

Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation

exploding-gradients • Sep 16, 2025 • 20

Article

4D masks support in Transformers

poedator • Jan 8, 2024 • 31

liked a model 5 months ago

zai-org/GLM-4.6V-Flash

Image-Text-to-Text • 10B • Updated Dec 9, 2025 • 34.6k • 602

upvoted a collection 5 months ago

👩‍💻 OlympicCoder

Collection

Reasoning datasets and models for competitive coding • 6 items • Updated Dec 7, 2025 • 20

upvoted an article 5 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 395

reacted to danielhanchen's post with 🔥 5 months ago

Post

2236

Mistral's new SOTA coding models, Devstral 2, can now be run locally! (25GB RAM) 🐱
We fixed the chat template, so performance should be much better now!
24B: unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF
123B: unsloth/Devstral-2-123B-Instruct-2512-GGUF

🧡Step-by-step Guide: https://docs.unsloth.ai/models/devstral-2

authored a paper 5 months ago

T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

Paper • 2512.10430 • Published Dec 11, 2025 • 119

upvoted an article 6 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311

liked a model 9 months ago

bigcode/starcoder

Text Generation • 16B • Updated Oct 8, 2024 • 22.7k • 2.95k
