Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 8 days ago • 59
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation Paper • 2503.21729 • Published 12 days ago • 27
view article Article Training and Finetuning Reranker Models with Sentence Transformers v4 14 days ago • 104
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published 14 days ago • 15
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 19 days ago • 46
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published 19 days ago • 67
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 21 days ago • 115
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 28 days ago • 378
Light-R1 Collection Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 26 days ago • 11
Hallucination detection Collection Trained ModernBERT (base and large) for detection hallucinations in LLM responses. The models are trained as token classifications. • 4 items • Updated Mar 5 • 15
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published Feb 25 • 26
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Paper • 2502.16894 • Published Feb 24 • 28
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25 • 73
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published Feb 10 • 131
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback Paper • 2502.15027 • Published Feb 20 • 7