Dmitry Balobin

d0rj

AI & ML interests

NLP and 🥴 tensors. MIPT 💙, 2GIS 💚

Recent Activity

liked a model 3 days ago

nyuuzyou/SmolLM2-135M-Eagle

liked a dataset 3 days ago

Luckyjhg/Geo170K

updated a collection 3 days ago

Math Instruct datasets in Russian

d0rj's activity

upvoted a collection 3 days ago

blt

4 items • Updated 3 days ago • 15

upvoted a collection 9 days ago

SANA-Sprint

🏃SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated 4 days ago • 35

upvoted a collection 21 days ago

Sana

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 4 days ago • 90

upvoted a collection 24 days ago

DRAMA

A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 6

upvoted a collection 28 days ago

SuperBPE

SuperBPE tokenizers and models trained with them • 8 items • Updated 11 days ago • 14

upvoted 3 collections about 1 month ago

Reasoning Dataset

7 items • Updated Mar 7 • 3

Datasets [RU]

SFT / RL high-quality datasets • 9 items • Updated 3 days ago • 2

Gemma 3 Release

24 items • Updated 3 days ago • 338

upvoted a paper about 1 month ago

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published Mar 5 • 229

upvoted a paper 2 months ago

SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published Feb 10 • 90

upvoted a collection 3 months ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 21 items • Updated 6 days ago • 127

upvoted a paper 3 months ago

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 94

upvoted a collection 3 months ago

Ru Dialogue Benchmarks

A collection of benchmarks for evaluating the quality of dialogue models in Russian. • 3 items • Updated Jan 15 • 2

upvoted 2 papers 4 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 101

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 39

upvoted a collection 4 months ago

T-pro-1.0

5 items • Updated Jan 15 • 6

upvoted a collection 6 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 252

upvoted a paper 7 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10, 2024 • 69

upvoted 2 collections 7 months ago

SAGE v1.1.0 release

4 items • Updated 26 days ago • 5

WebInstruct 🌐 Embeddings 🧱 Models

A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated Sep 4, 2024 • 11