samhog committed on
Commit
2b48317
1 Parent(s): ad23529

Update README.md

  1. README.md +2 -0
README.md CHANGED
@@ -9,6 +9,8 @@ This is a LLaMA-7B language model trained on 10.000 psychology-related prompts a
  ### Background
  This model was developed as part of a thesis project in the field of machine learning and psychology. It was used as a base model for further fine-tuning using reinforcement learning. The goal of the thesis was to compare reinforcement learning from *human feedback* and *AI feedback*. When the paper is available, it will be linked here!
 
+ **Links**: [RLHF model](https://huggingface.co/samhog/psychology-llama-rlhf); [RLAIF model](https://huggingface.co/samhog/psychology-llama-rlaif)
+
 
  **Authors:**
  Samuel Höglund, [email protected];