Motoki Wu

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

liked a Space about 3 hours ago

open-r1/README

liked a model 3 days ago

meta-llama/Llama-4-Scout-17B-16E

liked a model 3 days ago

meta-llama/Llama-4-Maverick-17B-128E-Instruct

tokestermw's activity

upvoted a paper 7 days ago

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 8 days ago • 59

upvoted a paper 9 days ago

ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation

Paper • 2503.21729 • Published 12 days ago • 27

upvoted a paper 10 days ago

Fully Autonomous AI Agents Should Not be Developed

Paper • 2502.02649 • Published Feb 4 • 33

upvoted an article 10 days ago

Training and Finetuning Reranker Models with Sentence Transformers v4

Article • 14 days ago • 104

upvoted a paper 13 days ago

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

Paper • 2503.19470 • Published 14 days ago • 15

upvoted 2 papers 16 days ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published 19 days ago • 84

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published 19 days ago • 46

upvoted a paper 18 days ago

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published 19 days ago • 67

upvoted a paper 21 days ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published 21 days ago • 115

upvoted a collection 27 days ago

Gemma 3 Release

Collection • 17 items • Updated 5 days ago • 317

upvoted an article 27 days ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • 28 days ago • 378

upvoted 2 collections about 1 month ago

Light-R1

Collection • Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 26 days ago • 11

Hallucination detection

Collection • Trained ModernBERT (base and large) for detecting hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated Mar 5 • 15

upvoted 5 papers about 1 month ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published Feb 25 • 26

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published Feb 24 • 28

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 73

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Paper • 2502.15027 • Published Feb 20 • 7