gupta-tanish
committed on
Update README.md
Fine-tuned the base model using Direct Preference Optimization (DPO) on the UltraFeedback dataset, keeping only instances where the score difference between the chosen and rejected responses is >= 5.
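The filtering rule in the commit description can be sketched as below. This is a minimal illustration, not the author's training code: the helper name `filter_by_score_gap` is hypothetical, and the field names `score_chosen` and `score_rejected` are assumptions about how the preference pairs are stored, since the commit does not say which columns were used.

```python
def filter_by_score_gap(rows, min_gap=5.0,
                        chosen_key="score_chosen",
                        rejected_key="score_rejected"):
    """Keep preference pairs whose chosen score exceeds the rejected
    score by at least `min_gap`. Field names are assumed, not confirmed."""
    return [
        row for row in rows
        if row[chosen_key] - row[rejected_key] >= min_gap
    ]


# Toy in-memory pairs standing in for dataset rows.
pairs = [
    {"prompt": "p1", "score_chosen": 9.0, "score_rejected": 3.0},  # gap 6.0, kept
    {"prompt": "p2", "score_chosen": 8.0, "score_rejected": 4.0},  # gap 4.0, dropped
    {"prompt": "p3", "score_chosen": 7.5, "score_rejected": 2.5},  # gap 5.0, kept
]

kept = filter_by_score_gap(pairs)
print([row["prompt"] for row in kept])  # ['p1', 'p3']
```

The threshold is inclusive, matching the ">= 5" wording in the commit description. Only the retained pairs would then be passed on to DPO training.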
README.md
CHANGED
@@ -4,7 +4,17 @@ datasets:
 - HuggingFaceH4/ultrafeedback_binarized
 language:
 - en
+- fr
 base_model:
 - NousResearch/Nous-Hermes-llama-2-7b
+- meta-llama/Llama-2-7b
 pipeline_tag: text-generation
+metrics:
+- accuracy
+- bertscore
+- bleurt
+- brier_score
+tags:
+- biology
+- chemistry
 ---