---
license: apache-2.0
datasets:
- athirdpath/Merge_Glue
---

### TeeZee/NEBULA-XB-v1.0_SFT_2_epoch ###

An experiment: can DUS (Depth Up-Scaling) be taken one or more steps further?

### Technical notes:
- pretrained model NEBULA-XB-v1.0 was finetuned on 30k entries from the Merge_Glue dataset
- 18 layers were removed from each of the two copies of the finetuned GALAXY-XB-v03
- the model has 108 layers: (((48-12)*2)-18)*2 = 108
- second step in the scaling DUS procedure (see the sketch below)
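
For reference, a minimal Python sketch of the layer arithmetic behind the two DUS steps (an illustration of the formula above, not the author's actual merge script; the `dus_step` helper name is made up here):

```python
# Illustrative only: layer counts are taken from the notes above.

def dus_step(n_layers: int, n_trimmed: int) -> int:
    """One depth up-scaling step: trim `n_trimmed` layers from each of
    two copies of the model, then stack the two trimmed copies."""
    return (n_layers - n_trimmed) * 2

base = 48                      # layers in the original pretrained model
galaxy = dus_step(base, 12)    # (48 - 12) * 2 = 72  -> GALAXY-XB
nebula = dus_step(galaxy, 18)  # (72 - 18) * 2 = 108 -> this model
assert nebula == 108
```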

### To evaluate
- model performance after the merge; it should be a little lower than GALAXY finetuned on 50k entries of SlimOrca