Kaiwen Wang
kaiwenw
AI & ML interests
Reinforcement Learning
kaiwenw/nov11_oasst_pref_jdpo_gpt4o_3_judges • 14.7k • 31
kaiwenw/nov11_oasst_pref_jdpo_llama70b_cot • 2.68k • 29
kaiwenw/nov11_oasst_pref_jdpo_llama70b_cot_11_judges • 14.7k • 31
kaiwenw/nov11_oasst_mini_pref_jdpo_llama8b_cot • 525 • 33
kaiwenw/nov11_oasst_mini_pref_jdpo_llama8b_cot_8_judges • 790 • 32
kaiwenw/oasst_pref_jdpo_llama70b_cot • 3.35k • 33
kaiwenw/oasst_pref_jdpo_llama70b_cot_12_judges • 14.7k • 29
kaiwenw/oasst_pref_jdpo_llama8b_cot_Meta-Llama-3.1-8B-Instruct_5_judges • 14.7k • 30
kaiwenw/oasst_mini_pref_jdpo_llama70b_cot_Meta-Llama-3.1-70B-Instruct_3_judges • 80 • 30
kaiwenw/nov6_oasst_jdpo_llama70b • 10.6k • 28
kaiwenw/oasst_Meta-Llama-3.1-70B-Instruct_3_judges • 7.37k • 30
kaiwenw/nov6_oasst_jdpo_llama8b • 11.2k • 31
kaiwenw/oasst_Meta-Llama-3.1-8B-Instruct_3_judges • 7.37k • 34
kaiwenw/nov5_sp1_jdpo_gap_0.25 • 6.68k • 28
kaiwenw/nov5_sp1_oct31_oasst_llama70b_jft_3_judges • 3.64k • 30
kaiwenw/nov6_oasst_mini_jdpo_llama8b_unflatten • 25 • 30
kaiwenw/nov6_oasst_mini_jdpo_llama8b • 50 • 33
kaiwenw/oasst_mini_Meta-Llama-3.1-8B-Instruct_3_judges • 40 • 29
kaiwenw/nov6_oasst_mini_jdpo_llama70b_unflatten • 14 • 29
kaiwenw/nov6_oasst_mini_jdpo_llama70b • 28 • 31
kaiwenw/nov5_sp1_jft_gap_0.25 • 1.91k • 29
3.64k • 29
kaiwenw/nov2_aft_gpt4o_1.1 • 3.59k • 33
kaiwenw/nov2_aft_gpt4o_1.0 • 3.38k • 34
kaiwenw/nov2_aft_gpt4o_0.9 • 3.05k • 32
kaiwenw/nov2_aft_llama70b_1.1 • 3.63k • 30
kaiwenw/nov2_aft_llama70b_1.0 • 3.5k • 33
kaiwenw/nov2_aft_llama70b_0.9 • 3.37k • 28
200 • 28
3k • 31