Mishig Davaadorj

mishig

AI & ML interests

NP-completeness, grammars, universality

Recent Activity

liked a Space about 1 hour ago

enzostvs/deepsite

published an article 1 day ago

The NLP Course is becoming the LLM Course!

updated a Space 2 days ago

huggingface/inference-playground

mishig's activity

upvoted a paper 7 days ago

Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 7

upvoted a paper 22 days ago

Sparse Autoencoders Find Highly Interpretable Features in Language Models

Paper • 2309.08600 • Published Sep 15, 2023 • 15

upvoted 2 articles about 1 month ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

Jan 15

• 170

Article

Remote VAEs for decoding with HF endpoints 🤗

Feb 24

• 37

upvoted 2 papers about 1 month ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 178

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 150

upvoted a collection about 2 months ago

SYNTHETIC-1

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Feb 20 • 50

upvoted an article about 2 months ago

Article

State of open video generation models in Diffusers

Jan 27

• 50

upvoted a paper about 2 months ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 29

upvoted a collection about 2 months ago

Hibiki fr-en

Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 52

upvoted a paper about 2 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 217

upvoted 2 papers 2 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 114

Deep Learning Scaling is Predictable, Empirically

Paper • 1712.00409 • Published Dec 1, 2017 • 1

upvoted a collection 3 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated about 14 hours ago • 279

upvoted a collection 4 months ago

Hymba

A series of Hybrid Small Language Models. • 2 items • Updated about 14 hours ago • 29

upvoted a paper 5 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 81