Quentin Gallouédec PRO

qgallouedec

AI & ML interests

None yet

Recent Activity

updated a model 10 days ago

trl-internal-testing/tiny-Llama4ForCausalLM

published a model 10 days ago

trl-internal-testing/tiny-Llama4ForCausalLM

updated a model 11 days ago

qgallouedec/Qwen-2.5-7B-Simple-RL

qgallouedec's activity

upvoted a paper 11 days ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 119

upvoted a paper 24 days ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published 25 days ago • 29

upvoted a collection about 1 month ago

Gemma 3 Release

24 items • Updated about 8 hours ago • 333

upvoted a paper about 1 month ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 8

upvoted an article about 1 month ago

The N Implementation Details of RLHF with PPO

Article • Published Oct 24, 2023 • 49

upvoted 3 papers about 2 months ago

ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

Paper • 1910.02054 • Published Oct 4, 2019 • 6

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 115

Presumed Cultural Identity: How Names Shape LLM Responses

Paper • 2502.11995 • Published Feb 17 • 10

upvoted a collection 2 months ago

DeepSeek-R1

8 items • Updated Jan 21 • 606

upvoted 2 papers 3 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 382

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 115

upvoted 2 papers 4 months ago

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 12

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 9

upvoted a collection 5 months ago

Tiny models

23 items • Updated Nov 30, 2024 • 1

upvoted a paper 5 months ago

QLoRA: Efficient Finetuning of Quantized LLMs

Paper • 2305.14314 • Published May 23, 2023 • 52