HART: Efficient Visual Generation with Hybrid Autoregressive Transformer • Paper • 2410.10812 • Published 10 days ago • 11
Addition is All You Need for Energy-efficient Language Models • Paper • 2410.00907 • Published 23 days ago • 139
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning • Paper • 2407.18248 • Published Jul 25 • 30