taicheng guo's picture

33 8

taicheng guo

taicheng

·

AI & ML interests

None yet

Recent Activity

liked a model 6 days ago

Qwen/QwQ-32B

upvoted a paper 14 days ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

liked a model 21 days ago

nvidia/Llama-3.1-Nemotron-70B-Reward

View all activity

Organizations

taicheng's activity

upvoted a paper 14 days ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published 16 days ago • 68

upvoted a paper 29 days ago

Teaching Language Models to Critique via Reinforcement Learning

Paper • 2502.03492 • Published Feb 5 • 24

upvoted a collection about 2 months ago

🧠 Reasoning Models

9 items • Updated 9 days ago • 37

upvoted 4 papers about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 95

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 262

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 91

upvoted a paper 2 months ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 50

upvoted 4 papers 3 months ago

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 73

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 92

What indeed can GPT models do in chemistry? A comprehensive benchmark on eight tasks

Paper • 2305.18365 • Published May 27, 2023 • 4

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published Nov 25, 2024 • 46

upvoted a collection 5 months ago

Power-LM

Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17, 2024 • 15

upvoted 2 papers 5 months ago

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 69

MURI: High-Quality Instruction Tuning Datasets for Low-Resource Languages via Reverse Instructions

Paper • 2409.12958 • Published Sep 19, 2024 • 8

upvoted 5 papers 6 months ago

GRIN: GRadient-INformed MoE

Paper • 2409.12136 • Published Sep 18, 2024 • 16

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Paper • 2409.17115 • Published Sep 25, 2024 • 62

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

Paper • 2409.12903 • Published Sep 19, 2024 • 22

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 138

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 24