samaffolter's Collections
AugmentedLearning
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Paper • 2312.15685 • Published • 17
mistralai/Mixtral-8x7B-Instruct-v0.1
Text Generation • Updated • 538k • 4.2k
microsoft/phi-2
Text Generation • Updated • 225k • 3.24k
TinyLlama/TinyLlama-1.1B-Chat-v1.0
Text Generation • Updated • 1.31M • 1.09k
Are Emergent Abilities in Large Language Models just In-Context Learning?
Paper • 2309.01809 • Published • 3
Commonsense Knowledge Transfer for Pre-trained Language Models
Paper • 2306.02388 • Published • 1
Schema-learning and rebinding as mechanisms of in-context learning and emergence
Paper • 2307.01201 • Published • 2
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Paper • 2305.01610 • Published • 2
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
Paper • 2308.12066 • Published • 4
Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Paper • 2308.06093 • Published • 2
Multi-Head Adapter Routing for Cross-Task Generalization
Paper • 2211.03831 • Published • 2
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Paper • 2305.06324 • Published • 1
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Paper • 2309.10020 • Published • 40
MIMIC-IT: Multi-Modal In-Context Instruction Tuning
Paper • 2306.05425 • Published • 11
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models
Paper • 2309.04041 • Published • 1
From Sparse to Soft Mixtures of Experts
Paper • 2308.00951 • Published • 20