arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Recent Activity
updated a dataset about 7 hours ago: qwselfcorr/math_augmath_starplus_tmp10_turn2
updated a dataset about 7 hours ago: qwselfcorr/math_augmath_starplus_tmp07_turn2
published a dataset about 7 hours ago: qwselfcorr/math_augmath_starplus_tmp07_turn2
models 23
weqweasdas/zephyr-7b-dpo-full • Text Generation • Updated • 6
weqweasdas/zephyr-7b-gemma-dpo • Updated
weqweasdas/zephyr-7b-sft-full • Updated
weqweasdas/zephyr-7b-dpo-qlora • Updated
weqweasdas/gpt2-cpt-dutch • Text Generation • Updated • 48
weqweasdas/zephyr-7b-gemma-sft • Updated
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • Updated • 2
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • Updated • 3
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • Updated • 4
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • Updated • 2
datasets 177
weqweasdas/rs_numia30k • Viewer • Updated • 30.6k • 11
weqweasdas/rs_math_train • Viewer • Updated • 7.5k • 16
weqweasdas/rs_math_test • Viewer • Updated • 5k • 33
weqweasdas/rs_gsm8k_test • Viewer • Updated • 1.32k • 10
weqweasdas/rs_gsm8k_train • Viewer • Updated • 7.47k • 13
weqweasdas/ace_processed • Viewer • Updated • 5.18M • 27
weqweasdas/llama31_70b_chosen_type12_mix • Viewer • Updated • 21.5k • 21
weqweasdas/prompt_math_test • Viewer • Updated • 15k • 25
weqweasdas/fixed05_llasft_math_7ktype2_7ktype3_ver2_150_tmp10_generation_with_rewards • Viewer • Updated • 30k • 36
weqweasdas/filtered_numia_prompt15k • Viewer • Updated • 15k • 23