license: apache-2.0 language: am

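  • DeepSpeed-RLHF system training: DeepSpeed-HE switches seamlessly between inference and training modes within RLHF. Language generation can therefore use DeepSpeed-Inference optimizations such as tensor parallelism and high-performance CUDA kernels, while the training side benefits from ZeRO- and LoRA-based memory optimization strategies. DeepSpeed-HE also automatically handles memory management and data caching across the different stages of RLHF.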

  • Train Data (English): --data_path Dahoas/rm-static Dahoas/full-hh-rlhf Dahoas/synthetic-instruct-gptj-pairwise yitingxie/rlhf-reward-datasets openai/webgpt_comparisons stanfordnlp/SHP

  • Train Data (Chinese): --data_path wangrui6/Zhihu-KOL Cohere/miracl-zh-queries-22-12 Hello-SimpleAI/HC3-Chinese mkqa-Chinese

  • The actor model and reward model can be customized, and the RLHF model can also be trained on its own.

  • Usage:

    git clone https://github.com/microsoft/DeepSpeedExamples
    
    cd DeepSpeedExamples/applications/DeepSpeed-Chat
    
    pip install -r requirements.txt
    
    python chat.py --path Laurie/opt1.3b-deepspeed-chat
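
The two `--data_path` lists above can be combined into a single training invocation. The following is a minimal sketch in Python that only assembles and prints such a command line; it does not start training. The helper `build_train_command` and the script name `main.py` are hypothetical placeholders, not part of this model card, so substitute the DeepSpeed-Chat training script and extra flags you actually use.

```python
import shlex

# Dataset names copied from the Train Data lists above.
ENGLISH_DATA = [
    "Dahoas/rm-static",
    "Dahoas/full-hh-rlhf",
    "Dahoas/synthetic-instruct-gptj-pairwise",
    "yitingxie/rlhf-reward-datasets",
    "openai/webgpt_comparisons",
    "stanfordnlp/SHP",
]
CHINESE_DATA = [
    "wangrui6/Zhihu-KOL",
    "Cohere/miracl-zh-queries-22-12",
    "Hello-SimpleAI/HC3-Chinese",
    "mkqa-Chinese",
]


def build_train_command(script, datasets, extra_args=()):
    """Return an argv list: the script, then --data_path and each dataset.

    Hypothetical helper for illustration; it only builds the list.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    return ["python", script, "--data_path", *datasets, *extra_args]


# "main.py" is an assumed placeholder for the training script.
cmd = build_train_command("main.py", ENGLISH_DATA + CHINESE_DATA)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

Keeping the dataset names in lists makes it easy to train on only the English or only the Chinese data by passing one list instead of both.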
    