Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper • 2410.10814 • Published 10 days ago • 46

MiniPLM: Knowledge Distillation for Pre-Training Language Models
Paper • 2410.17215 • Published 2 days ago • 9

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
Paper • 2410.16256 • Published 3 days ago • 53