Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 155
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 7 days ago • 235
SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking Paper • 2503.00955 • Published Mar 2 • 27
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Paper • 2502.18890 • Published Feb 26 • 28
From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub Article • Published Feb 12 • 62
From Files to Chunks: Improving Hugging Face Storage Efficiency Article • Published Nov 20, 2024 • 58
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 253
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 115
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 213
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 120
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 286
AI Paper of the Day Collection A curated collection of interesting papers, one added each day • 336 items • Updated 1 day ago • 41
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 101
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 147