bikalnetomi committed on
Commit acafd13 · verified · 1 Parent(s): 1f0c0a0

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ licence: license
 # Model Card for rlhf-ppo-llama31-8B-Reward-model-lora-r64-bikal
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct).
-It has been trained using [TRL](https://github.com/huggingface/trl) with ultrafeedback-Binarized Dataset(trl-lib/ultrafeedback_binarized)
+It has been trained using [TRL](https://github.com/huggingface/trl) with [ultrafeedback-Binarized Dataset](trl-lib/ultrafeedback_binarized)
 
 ## Quick start
 