
Xiao Liu

lx865712528

https://xiaoliunlc.github.io/

AI & ML interests

NLP, LLM and reasoning

Recent Activity

authored a paper 7 days ago

Sigma-Moe-Tiny Technical Report

authored a paper 7 days ago

SIGMA: An AI-Empowered Training Stack on Early-Life Hardware

upvoted a paper 24 days ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle



authored 2 papers 7 days ago

Sigma-Moe-Tiny Technical Report

Paper • 2512.16248 • Published 11 days ago • 1

SIGMA: An AI-Empowered Training Stack on Early-Life Hardware

Paper • 2512.13488 • Published 14 days ago

upvoted a paper 24 days ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published 25 days ago • 149

authored a paper about 2 months ago

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data

Paper • 2510.25804 • Published Oct 29 • 1

commented on a paper about 2 months ago

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data

Paper • 2510.25804 • Published Oct 29 • 1

upvoted a paper about 2 months ago

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data

Paper • 2510.25804 • Published Oct 29 • 1

upvoted a paper 2 months ago

Knocking-Heads Attention

Paper • 2510.23052 • Published Oct 27 • 29

authored a paper 2 months ago

Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection

Paper • 2510.18909 • Published Oct 21 • 4

upvoted a paper 2 months ago

Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection

Paper • 2510.18909 • Published Oct 21 • 4

commented on a paper 2 months ago

Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection

Paper • 2510.18909 • Published Oct 21 • 4

authored a paper 3 months ago

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Paper • 2510.08008 • Published Oct 9 • 5

upvoted a paper 3 months ago

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Paper • 2510.08008 • Published Oct 9 • 5

authored a paper 3 months ago

Behind RoPE: How Does Causal Mask Encode Positional Information?

Paper • 2509.21042 • Published Sep 25 • 8

upvoted a paper 3 months ago

Behind RoPE: How Does Causal Mask Encode Positional Information?

Paper • 2509.21042 • Published Sep 25 • 8

commented on a paper 3 months ago

Behind RoPE: How Does Causal Mask Encode Positional Information?

Paper • 2509.21042 • Published Sep 25 • 8

authored a paper 5 months ago

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

Paper • 2507.15640 • Published Jul 21 • 4

upvoted a paper 5 months ago

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

Paper • 2507.15640 • Published Jul 21 • 4

commented on a paper 5 months ago

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

Paper • 2507.15640 • Published Jul 21 • 4

upvoted a paper 7 months ago

TL;DR: Too Long, Do Re-weighting for Effcient LLM Reasoning Compression

Paper • 2506.02678 • Published Jun 3 • 5

commented on a paper 7 months ago

TL;DR: Too Long, Do Re-weighting for Effcient LLM Reasoning Compression

Paper • 2506.02678 • Published Jun 3 • 5
