Update README.md
The final evaluation cross-entropy for this model ended up around 0.4.

The table below shows the cross-entropy for each technique when embedding training was included. Without the embedding training, the results were usually worse by up to 0.1.

| Technique | Loss on Llama 3.1 fine-tuning | Notes |
|---|---|---|
| **LoRA** | 0.4603 | |
| **LoRA+** | 0.4011 | The model uploaded here |
| **DoRA** | 0.4182 | |
| **QLoRA (70B model)** | 0.3694 | Best evaluation loss; the model was too big to optimize further within my budget |
| **QLoRA (8B model)** | 0.5471 | |
| **(Lo)ReFT** | 0.4824 | |
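As a point of reference, below is a minimal sketch of how the embedding training mentioned above can be enabled on top of a LoRA adapter with Hugging Face PEFT. The checkpoint name, rank, and target modules are illustrative assumptions, not the exact configuration used for these runs.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative checkpoint; the runs in the table used Llama 3.1 (8B and 70B) variants.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

lora_config = LoraConfig(
    r=16,                                                     # assumed rank
    lora_alpha=32,                                            # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    # Train and save the embedding and output head together with the adapter.
    # Leaving this out corresponds to the "without the embedding" runs,
    # which scored up to ~0.1 worse in the table above.
    modules_to_save=["embed_tokens", "lm_head"],
    # use_dora=True,                                          # uncomment for DoRA instead of plain LoRA
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

The QLoRA rows use the same kind of adapter applied to a 4-bit quantized base model, and LoRA+ only changes the optimizer (a higher learning rate for the B matrices), so the adapter configuration itself stays essentially the same.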