FsfairX

community

sfairXC's activity

hendrydong

authored a paper 2 days ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published 5 days ago • 30

hendrydong

authored a paper about 1 month ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38

hendrydong

updated a model 4 months ago

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • Updated Oct 14, 2024 • 5.7k • 54

hendrydong

authored a paper 4 months ago

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Paper • 2410.04698 • Published Oct 7, 2024 • 13

hendrydong

updated 4 models 5 months ago

sfairXC/llama-3.1-sft-2ep

Text Generation • Updated Sep 18, 2024 • 1

sfairXC/llama-3.1-sft-1ep

Text Generation • Updated Sep 18, 2024 • 3

sfairXC/gemma-sft-2ep

Text Generation • Updated Aug 30, 2024 • 92

sfairXC/gemma-sft-1ep

Text Generation • Updated Aug 30, 2024 • 89

hendrydong

authored a paper 6 months ago

ThinK: Thinner Key Cache by Query-Driven Pruning

Paper • 2407.21018 • Published Jul 30, 2024 • 31

hendrydong

updated a model 7 months ago

sfairXC/FsfairX-Gemma2-RM-v0.1

Text Classification • Updated Jul 9, 2024 • 7.89k • 7

hendrydong

authored 8 papers 9 months ago

Reverse Diffusion Monte Carlo

Paper • 2307.02037 • Published Jul 5, 2023 • 1

Spurious Feature Diversification Improves Out-of-distribution Generalization

Paper • 2309.17230 • Published Sep 29, 2023

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

Local Augmentation for Graph Neural Networks

Paper • 2109.03856 • Published Sep 8, 2021

Weakly Supervised Disentangled Generative Causal Representation Learning

Paper • 2010.02637 • Published Oct 6, 2020

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2

DetGPT: Detect What You Need via Reasoning

Paper • 2305.14167 • Published May 23, 2023

bpucla, hendrydong

authored a paper 9 months ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 67