js

rldy

AI & ML interests

None yet

Recent Activity

liked a Space 24 days ago

smolagents/smolagents-leaderboard

rldy's activity

upvoted a paper 12 days ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published 17 days ago • 112

upvoted a paper 19 days ago

Transformers without Normalization

Paper • 2503.10622 • Published 22 days ago • 150

upvoted 4 papers about 1 month ago

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Paper • 2502.19414 • Published Feb 26 • 20

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published Feb 24 • 71

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published Feb 20 • 47

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18 • 68

upvoted 14 papers about 2 months ago

CRANE: Reasoning with constrained LLM generation

Paper • 2502.09061 • Published Feb 13 • 19

ReLearn: Unlearning via Learning for Large Language Models

Paper • 2502.11190 • Published Feb 16 • 29

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

Paper • 2502.12115 • Published Feb 17 • 43

Diverse Inference and Verification for Advanced Reasoning

Paper • 2502.09955 • Published Feb 14 • 17

The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published Feb 9 • 38

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 48

LM2: Large Memory Models

Paper • 2502.06049 • Published Feb 9 • 30

Teaching Language Models to Critique via Reinforcement Learning

Paper • 2502.03492 • Published Feb 5 • 24

LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published Feb 11 • 38

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Paper • 2502.06772 • Published Feb 10 • 21

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Paper • 2502.05003 • Published Feb 7 • 43

Matryoshka Quantization

Paper • 2502.06786 • Published Feb 10 • 30

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 148

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 60