Update README.md
README.md
CHANGED
@@ -9,6 +9,8 @@ This is a LLaMA-7B language model trained on 10.000 psychology-related prompts a
 ### Background
 This model was developed as part of a thesis project in the field of machine learning and psychology. It was used as a base model for further fine-tuning using reinforcement learning. The goal of the thesis was to compare reinforcement learning from *human feedback* and *AI feedback*. When the paper is available, it will be linked here!
 
+**Links**: [RLHF model](https://huggingface.co/samhog/psychology-llama-rlhf); [RLAIF model](https://huggingface.co/samhog/psychology-llama-rlaif)
+
 
 **Authors:**
 Samuel Höglund, [email protected];