Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 142
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 146
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Paper • 2409.18943 • Published Sep 27, 2024 • 29
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models Paper • 2409.17066 • Published Sep 25, 2024 • 28
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Paper • 2408.07055 • Published Aug 13, 2024 • 66
Scalify: scale propagation for efficient low-precision LLM training Paper • 2407.17353 • Published Jul 24, 2024 • 13
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models Paper • 2407.11062 • Published Jul 10, 2024 • 8
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 78