Jiunsong/supergemma4-26b-uncensored-gguf-v2 Text Generation • 25B • Updated about 1 month ago • 288k • 547
Jiunsong/supergemma4-26b-abliterated-multimodal-mlx-4bit Image-Text-to-Text • 5B • Updated 24 days ago • 9.08k • 51
MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference Paper • 2605.07363 • Published 5 days ago • 12
Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key Paper • 2605.06638 • Published 6 days ago • 13
AcademiClaw: When Students Set Challenges for AI Agents Paper • 2605.02661 • Published 9 days ago • 15
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper • 2605.05204 • Published 7 days ago • 25
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 6 days ago • 37
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published 5 days ago • 57
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 6 days ago • 92
Flow-OPD: On-Policy Distillation for Flow Matching Models Paper • 2605.08063 • Published 5 days ago • 82
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 9 days ago • 107
Heterogeneous Scientific Foundation Model Collaboration Paper • 2604.27351 • Published 13 days ago • 211
ibm-granite/granite-speech-4.1-2b Automatic Speech Recognition • 2B • Updated 13 days ago • 138k • 90
The ultimate guide to RL environments: building and scaling them in the LLM era • Running • 139 • Building and scaling RL environments for LLM training