- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 183
- MathScale: Scaling Instruction Tuning for Mathematical Reasoning
  Paper • 2403.02884 • Published • 15
peng
superpeng
AI & ML interests
None yet
Recent Activity
liked a dataset about 18 hours ago
Krystalan/xmediasum
upvoted a paper about 18 hours ago
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
upvoted an article 22 days ago
Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging
Organizations
None yet
Collections (5)
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 19
- StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
  Paper • 2402.16671 • Published • 26
- LoRA Learns Less and Forgets Less
  Paper • 2405.09673 • Published • 87
- NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
  Paper • 2405.17428 • Published • 17
Models
None public yet
Datasets
None public yet