Yohan Na PRO

nayohan

AI & ML interests

NLP, Dialogue systems

Recent Activity

upvoted a paper 28 days ago

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

liked a dataset about 1 month ago

nebius/SWE-rebench-openhands-trajectories

upvoted a paper about 1 month ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

upvoted a paper 28 days ago

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Paper • 2601.06165 • Published Jan 7 • 16

upvoted a paper about 1 month ago

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published Dec 23, 2025 • 27

upvoted a collection about 2 months ago

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano v3. • 8 items • Updated 5 days ago • 62

upvoted an article 2 months ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Dec 4, 2025 • 591

upvoted an article 3 months ago

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

Article • Feb 4, 2025 • 28

upvoted a paper 3 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 223

upvoted an article 3 months ago

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Article • Nov 3, 2025 • 57

upvoted a paper 3 months ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 123

upvoted a collection 4 months ago

KORMo pretraining datasets

The pretraining datasets for KORMo-10B were collected from diverse, publicly available sources. • 14 items • Updated Oct 13, 2025 • 21

upvoted 2 papers 4 months ago

KORMo: Korean Open Reasoning Model for Everyone

Paper • 2510.09426 • Published Oct 10, 2025 • 86

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 27

upvoted a collection 4 months ago

Qwen3-VL

37 items • Updated Dec 31, 2025 • 621

upvoted an article 4 months ago

Introducing RTEB: A New Standard for Retrieval Evaluation

Article • Oct 1, 2025 • 135

upvoted an article 5 months ago

mmBERT: ModernBERT goes Multilingual

Article • Sep 9, 2025 • 133

upvoted a collection 5 months ago

[Dataset] FineWeb2 Edu Korean

5 items • Updated Jul 24, 2025 • 2

upvoted an article 5 months ago

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Article • Aug 8, 2025 • 92

upvoted a collection 6 months ago

AI2 Safety Toolkit

Safety data, moderation tools and safe LLMs. • 6 items • Updated Dec 23, 2025 • 8

upvoted 2 papers 8 months ago

Essential-Web v1.0: 24T tokens of organized web data

Paper • 2506.14111 • Published Jun 17, 2025 • 46

Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions

Paper • 2506.00421 • Published May 31, 2025 • 5

upvoted a collection 10 months ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.64k