NeMo · PyTorch · English · text generation · causal-lm
arham19 committed
Commit 55faaab · 1 Parent(s): 173321b

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -78,16 +78,16 @@ SteerLM Llama-2 is a 13 billion parameter generative language model based on the

  Key capabilities enabled by SteerLM:

- - Dynamic steering of responses by specifying desired attributes like quality, helpfulness, and toxicity
- - Simplified training compared to RLHF techniques like fine-tuning and bootstrapping
+ - Dynamic steering of responses by specifying desired attributes like quality, helpfulness, and toxicity.
+ - Simplified training compared to RLHF techniques like fine-tuning and bootstrapping.

  ## Model Architecture and Training
  The SteerLM method involves the following key steps:

- 1. Train an attribute prediction model on human annotated data to evaluate response quality
- 2. Use this model to annotate diverse datasets and enrich training data
- 3. Perform conditioned fine-tuning to align responses with specified combinations of attributes
- 4. (Optionally) Bootstrap training through model sampling and further fine-tuning
+ 1. Train an attribute prediction model on human annotated data to evaluate response quality.
+ 2. Use this model to annotate diverse datasets and enrich training data.
+ 3. Perform conditioned fine-tuning to align responses with specified combinations of attributes.
+ 4. (Optionally) Bootstrap training through model sampling and further fine-tuning.

  SteerLM Llama-2 applies this technique on top of the Llama-2 architecture. It was pretrained on internet-scale data and then customized using [OASST](https://huggingface.co/datasets/OpenAssistant/oasst1) and [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) data.

@@ -109,7 +109,7 @@ pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp
  pip install nemo_toolkit['nlp']==1.17.0
  ```

- Alternatively, you can use NeMo Megatron training docker container with all dependencies pre-installed.
+ Alternatively, you can use NeMo Framework.

  2. Launch eval server

@@ -199,7 +199,7 @@ The model was trained on the data originally crawled from the Internet. This dat
  We did not perform any bias/toxicity removal or model alignment on this checkpoint.


- ## Licence
+ ## License

  - Llama 2 is licensed under the [LLAMA 2 Community License](https://ai.meta.com/llama/license/), Copyright © Meta Platforms, Inc. All Rights Reserved.
  - Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the [Acceptable Use Policy](https://ai.meta.com/llama/use-policy) for the Llama Materials.
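
Steps 2 and 3 of the SteerLM method listed in the README hunk above (annotating data with predicted attributes, then fine-tuning on attribute-conditioned text) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Example` type, the word-count heuristic standing in for a trained attribute prediction model, and the `<attributes>...</attributes>` tag format are hypothetical and are not taken from the README or from NeMo.

```python
from dataclasses import dataclass

# Attribute names borrowed from the README's examples; the set is illustrative.
ATTRIBUTES = ("quality", "helpfulness", "toxicity")


@dataclass
class Example:
    prompt: str
    response: str


def predict_attributes(example: Example) -> dict:
    """Toy stand-in for the attribute prediction model of step 1.

    A real model would be trained on human-annotated data; this heuristic
    only scores by response length so the sketch stays runnable.
    """
    words = len(example.response.split())
    score = min(4, words // 3)
    return {"quality": score, "helpfulness": score, "toxicity": 0}


def to_conditioned_text(example: Example, attrs: dict) -> str:
    """Step 3 input: put the attribute labels between prompt and response.

    The tag format is hypothetical, chosen only to show the idea of
    conditioning the response on explicit attribute values.
    """
    tag = ",".join(f"{name}:{attrs[name]}" for name in ATTRIBUTES)
    return f"{example.prompt}\n<attributes>{tag}</attributes>\n{example.response}"


def build_training_set(examples: list) -> list:
    """Step 2 followed by step 3 formatting: annotate, then condition."""
    return [to_conditioned_text(ex, predict_attributes(ex)) for ex in examples]
```

At inference time the same formatting function could be called with attribute values chosen by the user instead of predicted ones, which is what makes the responses steerable in this sketch.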