IlyaGusev committed on
Commit
55a2da9
1 Parent(s): 0af818c

Update README.md

  1. README.md +16 -1
README.md CHANGED
@@ -14,6 +14,7 @@ datasets:

  Based on [Llama-3 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).

+
  ChatML prompt format:
  ```
  <|im_start|>system
@@ -49,4 +50,18 @@ v2:
  - dataset code revision d0d123dd221e10bb2a3383bcb1c6e4efe1b4a28a
  - wandb [link](https://wandb.ai/ilyagusev/huggingface/runs/r6u5juyk)
  - 5 datasets: ru_turbo_saiga, ru_sharegpt_cleaned, oasst1_ru_main_branch, gpt_roleplay_realm, ru_instruct_gpt4
- - Datasets merging script: [create_short_chat_set.py](https://github.com/IlyaGusev/rulm/blob/d0d123dd221e10bb2a3383bcb1c6e4efe1b4a28a/self_instruct/src/data_processing/create_short_chat_set.py)
+ - Datasets merging script: [create_short_chat_set.py](https://github.com/IlyaGusev/rulm/blob/d0d123dd221e10bb2a3383bcb1c6e4efe1b4a28a/self_instruct/src/data_processing/create_short_chat_set.py)
+
+
+ # Evaluation
+
+ * Dataset: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/data/tasks.jsonl
+ * Framework: https://github.com/tatsu-lab/alpaca_eval
+ * Evaluator: alpaca_eval_cot_gpt4_turbo_fn
+
+ | model | length_controlled_winrate | win_rate | standard_error | avg_length |
+ |-----|-----|-----|-----|-----|
+ |chatgpt_4_turbo | 76.04 | 90.00 |1.46 | 1270 |
+ |chatgpt_3_5_turbo | 50.00 | 50.00 | 0.00 | 536 |
+ |saiga_llama3_8b | 33.07 | 48.19 | 2.45 | 1166 |
+ |saiga_mistral_7b | 23.38 | 35.99 | 2.34 | 949 |
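
A note on reading the `win_rate` and `standard_error` columns of the evaluation table added in this commit: the sketch below turns each pair into a rough 95% interval and checks whether it contains 50.00. It assumes a normal approximation (`z = 1.96`) and assumes that `chatgpt_3_5_turbo` is the reference model, which its 50.00 / 0.00 row suggests but the README does not state. The function names are hypothetical and not part of alpaca_eval.

```python
# Rows copied from the evaluation table: (model, win_rate, standard_error), in percent.
ROWS = [
    ("chatgpt_4_turbo", 90.00, 1.46),
    ("saiga_llama3_8b", 48.19, 2.45),
    ("saiga_mistral_7b", 35.99, 2.34),
]

Z_95 = 1.96  # assumption: normal approximation for a two-sided 95% interval


def win_rate_interval(win_rate, standard_error, z=Z_95):
    """Return (low, high) for win_rate +/- z * standard_error."""
    margin = z * standard_error
    return win_rate - margin, win_rate + margin


def overlaps_baseline(win_rate, standard_error, baseline=50.0):
    """True if the interval contains the baseline win rate (assumed to be 50%)."""
    low, high = win_rate_interval(win_rate, standard_error)
    return low <= baseline <= high


if __name__ == "__main__":
    for model, win_rate, standard_error in ROWS:
        low, high = win_rate_interval(win_rate, standard_error)
        verdict = "contains" if overlaps_baseline(win_rate, standard_error) else "excludes"
        print(f"{model}: {low:.2f} to {high:.2f} ({verdict} 50.00)")
```

Under these assumptions, the interval for saiga_llama3_8b contains 50.00, while the intervals for chatgpt_4_turbo and saiga_mistral_7b do not. The check covers `win_rate` only; the table reports no standard error for `length_controlled_winrate`.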