Diving into Self-Evolving Training for Multimodal Reasoning • Paper • 2412.17451 • Published 3 days ago • 36
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners • Paper • 2412.17256 • Published 4 days ago • 36
Zephyr 7B Gemma • Collection • Models, dataset, and demo for Zephyr 7B Gemma. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 5 items • Updated Apr 12 • 15
AppAgent: Multimodal Agents as Smartphone Users • Paper • 2312.13771 • Published Dec 21, 2023 • 52
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning • Paper • 2312.15685 • Published Dec 25, 2023 • 16
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition • Paper • 2307.13269 • Published Jul 25, 2023 • 31
ARB: Advanced Reasoning Benchmark for Large Language Models • Paper • 2307.13692 • Published Jul 25, 2023 • 16
Towards a Unified View of Parameter-Efficient Transfer Learning • Paper • 2110.04366 • Published Oct 8, 2021 • 2
Composing Parameter-Efficient Modules with Arithmetic Operations • Paper • 2306.14870 • Published Jun 26, 2023 • 3