CCMat's Collections
LLMs
TinyLlama: An Open-Source Small Language Model
Paper
•
2401.02385
•
Published
•
90
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper
•
2401.13601
•
Published
•
45
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper
•
2401.15024
•
Published
•
69
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper
•
2401.16380
•
Published
•
48
Weaver: Foundation Models for Creative Writing
Paper
•
2401.17268
•
Published
•
43
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Paper
•
2402.00159
•
Published
•
61
BlackMamba: Mixture of Experts for State-Space Models
Paper
•
2402.01771
•
Published
•
23
Chain-of-Thought Reasoning Without Prompting
Paper
•
2402.10200
•
Published
•
104
Nomic Embed: Training a Reproducible Long Context Text Embedder
Paper
•
2402.01613
•
Published
•
14
OLMo: Accelerating the Science of Language Models
Paper
•
2402.00838
•
Published
•
82
SPAR: Personalized Content-Based Recommendation via Long Engagement Attention
Paper
•
2402.10555
•
Published
•
34
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
114
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
606
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper
•
2403.05530
•
Published
•
61
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper
•
2403.09611
•
Published
•
125
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
65
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Paper
•
2404.07839
•
Published
•
43
Understanding the planning of LLM agents: A survey
Paper
•
2402.02716
•
Published
•
1
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper
•
2201.11903
•
Published
•
9
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Paper
•
2303.17580
•
Published
•
10
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper
•
2404.14619
•
Published
•
126
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
•
2405.01535
•
Published
•
120
Better & Faster Large Language Models via Multi-token Prediction
Paper
•
2404.19737
•
Published
•
73
Octopus v4: Graph of language models
Paper
•
2404.19296
•
Published
•
116
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Paper
•
2407.03320
•
Published
•
93
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper
•
2404.14219
•
Published
•
254
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
•
2405.00732
•
Published
•
119