Update README.md
README.md
CHANGED
@@ -1,3 +1,5 @@
+Test network using differential attention instead of classical attention. Other than some alterations to the attention, this is otherwise the same configuration as https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct
+
 
 # Training Metrics
 