Unseen Papers - a Julius-L Collection

Julius-L 's Collections

multimodal dataset

Memory Efficient Training

Model Architecture

LLM Technical Reports

Unseen Papers

updated Nov 1, 2024

MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published Oct 22, 2024 • 14
LOGO -- Long cOntext aliGnment via efficient preference Optimization

Paper • 2410.18533 • Published Oct 24, 2024 • 42
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 89
LongReward: Improving Long-context Large Language Models with AI Feedback

Paper • 2410.21252 • Published Oct 28, 2024 • 17
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

Paper • 2410.19313 • Published Oct 25, 2024 • 19