TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T Text Generation • Updated Sep 27 • 354k • 164
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25 • 31
Learning Multi-Step Reasoning by Solving Arithmetic Tasks Paper • 2306.01707 • Published Jun 2, 2023 • 1
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17 • 33
TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T Text Generation • Updated Feb 3 • 7.78k • 59