SePPO: Semi-Policy Preference Optimization for Diffusion Alignment Paper • 2410.05255 • Published Oct 7, 2024 • 4
OmniBench: Towards The Future of Universal Omni-Language Models Paper • 2409.15272 • Published Sep 23, 2024 • 26
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29, 2024 • 46
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions Paper • 2310.18780 • Published Oct 28, 2023 • 3
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections Paper • 2206.12037 • Published Jun 24, 2022
Zoology: Measuring and Improving Recall in Efficient Language Models Paper • 2312.04927 • Published Dec 8, 2023 • 2
Simple linear attention language models balance the recall-throughput tradeoff Paper • 2402.18668 • Published Feb 28, 2024 • 18
An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry Paper • 2303.02552 • Published Mar 5, 2023
PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages Paper • 2303.08934 • Published Mar 15, 2023
Challenges and Practices of Deep Learning Model Reengineering: A Case Study on Computer Vision Paper • 2303.07476 • Published Mar 13, 2023
An Experience Report on Machine Learning Reproducibility: Guidance for Practitioners and TensorFlow Model Garden Contributors Paper • 2107.00821 • Published Jul 2, 2021
Analysis of Failures and Risks in Deep Learning Model Converters: A Case Study in the ONNX Ecosystem Paper • 2303.17708 • Published Mar 30, 2023
PeaTMOSS: A Dataset and Initial Analysis of Pre-Trained Models in Open-Source Software Paper • 2402.00699 • Published Feb 1, 2024 • 2