Text Generation
NeMo
English
nvidia
steerlm
llama2
reward model
zhilinw commited on
Commit
a152b1d
1 Parent(s): 97a3289

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -1
README.md CHANGED
@@ -42,7 +42,8 @@ HelpSteer Paper : [HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM](h
42
 
43
  SteerLM Paper: [SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF](https://arxiv.org/abs/2310.05344)
44
 
45
- Llama2-13B-SteerLM-RM is trained with NVIDIA NeMo, an end-to-end, cloud-native framework to build, customize, and deploy generative AI models anywhere. It includes training and inferencing frameworks, guardrailing toolkits, data curation tools, and pretrained models, offering enterprises an easy, cost-effective, and fast way to adopt generative AI.
 
46
 
47
  ## Usage:
48
 
 
42
 
43
  SteerLM Paper: [SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF](https://arxiv.org/abs/2310.05344)
44
 
45
+ Llama2-13B-SteerLM-RM is trained with NVIDIA [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner), a scalable toolkit for performant and efficient model alignment. NeMo-Aligner is built using the [NeMo Framework](https://github.com/NVIDIA/NeMo) which allows for scaling training up to 1000s of GPUs using tensor, data and pipeline parallelism for all components of alignment. All of our checkpoints are cross compatible with the NeMo ecosystem, allowing for inference deployment and further customization.
46
+
47
 
48
  ## Usage:
49