Fukang Wen

smallkang2025

AI & ML interests

None yet

Recent Activity

liked a Space 5 days ago

HuggingFaceFW/blogpost-fine-tasks

liked a Space 5 days ago

HuggingFaceFW/FinePDFsBlog

Organizations

None yet

upvoted a paper 4 days ago

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

Paper • 2605.14333 • Published 8 days ago • 32

liked 4 Spaces 5 days ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

FinePDFs: Liberating 3T of the finest tokens from PDFs

📄

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

236

Explore synthetic data experiments on a virtual bookshelf

Evaluation Guidebook

📝

321

Explore LLM benchmark trends over time

upvoted a paper 7 days ago

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published 9 days ago • 96

upvoted a paper 4 months ago

MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

Paper • 2601.07832 • Published Jan 12 • 53

liked a Space 5 months ago

AI Deadlines

⚡

753

View upcoming AI conference deadlines in one place

liked 3 Spaces 7 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.35k

Explore and download the FineWeb web‑text dataset

The Ultra-Scale Playbook

🌌

3.85k

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

📚

3.18k

The secrets to building world-class LLMs