Shihan Dou

Ablustrund

AI & ML interests

Natural Language Processing, Large Language Models

Ablustrund's activity

upvoted a paper 28 days ago

DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Paper • 2503.06053 • Published Mar 8 • 136

upvoted an article 10 months ago

Article

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Jun 18, 2024

• 46

commented on 2 papers about 1 year ago

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11, 2024 • 29

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Paper • 2307.15217 • Published Jul 27, 2023 • 38

upvoted a paper about 1 year ago

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11, 2024 • 29

commented on a paper about 1 year ago

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Paper • 2402.01391 • Published Feb 2, 2024 • 44

upvoted a paper about 1 year ago

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Paper • 2402.01391 • Published Feb 2, 2024 • 44

authored 2 papers about 1 year ago

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Paper • 2402.01391 • Published Feb 2, 2024 • 44

MouSi: Poly-Visual-Expert Vision-Language Models

Paper • 2401.17221 • Published Jan 30, 2024 • 9

authored a paper over 1 year ago

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11, 2024 • 29

liked 4 datasets over 1 year ago

vikp/evol_instruct_v2_filtered_109k

Viewer • Updated Aug 29, 2023 • 110k • 21 • 3

mrqa-workshop/mrqa

Viewer • Updated Jan 24, 2024 • 585k • 523 • 23

lucadiliello/naturalquestionsshortqa

Viewer • Updated Jun 6, 2023 • 117k • 56 • 3

openbmb/UltraFeedback

Viewer • Updated Dec 29, 2023 • 64k • 1.68k • 357

updated a model over 1 year ago

fnlp/moss-rlhf-policy-model-7B-en

Updated Jul 17, 2023 • 2

New activity in fnlp/moss-rlhf-policy-model-7B-en over 1 year ago

Upload diff/generation_config.json with huggingface_hub

#9 opened over 1 year ago by

Upload diff/pytorch_model-00003-of-00003.bin with huggingface_hub

#8 opened over 1 year ago by

Upload diff/pytorch_model-00001-of-00003.bin with huggingface_hub

#7 opened over 1 year ago by

Upload diff/pytorch_model-00002-of-00003.bin with huggingface_hub

#6 opened over 1 year ago by

Upload diff/config.json with huggingface_hub

#5 opened over 1 year ago by