pt-sk committed on
Commit
9e74c5b
1 Parent(s): 3c33df0

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -1,7 +1,10 @@
  ---
  license: mit
  datasets: pt-sk/toxic_classification
- tags: ["PPO", "RLHF"]
+ tags:
+ - PPO
+ - RLHF
+ pipeline_tag: text-generation
  ---
  Aligning the model using Proximal Policy Optimization (PPO). The goal is to train the model to generate non-toxic reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
  Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-Non-Toxic-RLHF)
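
For readers unfamiliar with the PPO objective the README refers to, here is a minimal sketch of the standard clipped surrogate term in plain Python. It is not taken from the linked repository and does not use `trl`. The function names, the `clip_eps=0.2` default, and the idea that the advantage would come from a non-toxicity reward are assumptions made for illustration only.

```python
import math


def probability_ratio(logp_new, logp_old):
    """Ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities."""
    return math.exp(logp_new - logp_old)


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate objective (a quantity to maximize).

    ratio     : probability ratio between the updated and the old policy
    advantage : advantage estimate (assumed here to derive from a reward
                such as a non-toxicity score; hypothetical for this sketch)
    clip_eps  : clipping range; 0.2 is a common default, assumed here
    """
    # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum keeps the pessimistic estimate, which removes the
    # incentive to push the policy far from the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With a positive advantage, a ratio above `1 + clip_eps` earns no extra objective, so a single update cannot profit from a large policy shift. With a negative advantage, the unclipped term is kept whenever it is the smaller of the two.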