Rethinking Diverse Human Preference Learning through Principal Component Analysis Paper • 2502.13131 • Published Feb 18 • 36
MaxwellJryao/sft_loraMoE_wiki_hop_original_choose_best_object_affirmative_1-lora-sft_Qwen2-1.5B_lr-1e-3 Updated Sep 5, 2024
Post-training-Data-Flywheel/NousResearch-hermes-function-calling-v1 Viewer • Updated Aug 30, 2024 • 1.89k • 38
Post-training-Data-Flywheel/glaiveai-glaive-function-calling-v2 Viewer • Updated Aug 23, 2024 • 75.2k • 53 • 1
Post-training-Data-Flywheel/ise-uiuc-Magicoder-OSS-Instruct-75K Viewer • Updated Aug 23, 2024 • 75.2k • 33
Post-training-Data-Flywheel/Salesforce-xlam-function-calling-60k Viewer • Updated Aug 23, 2024 • 60k • 36
Post-training-Data-Flywheel/RLHFlow-CodeUltraFeedback-standard Viewer • Updated Aug 23, 2024 • 38.4k • 34 • 1
MaxwellJryao/sft_wiki_hop_original_choose_best_object_affirmative_1-lora-sft_Qwen2-1.5B_lr-1e-3 Updated Aug 12, 2024
MaxwellJryao/sft_imdb_Reviewer_Opinion_bad_good_choices-lora-sft_Qwen2-1.5B_lr-1e-3 Updated Aug 12, 2024
MaxwellJryao/sft_super_glue_boolq_yes_no_question-lora-sft_Qwen2-1.5B_lr-1e-3 Updated Aug 12, 2024 • 5