Update README.md
README.md
CHANGED
@@ -30,6 +30,18 @@ pipeline_tag: text-generation
# SpydazWeb AGI
(Trained with heads.) Updated the codebase to the Mistral-Nemo codebase. (Perhaps I will make the Nemo today!)

## Training Note :
This is the base FP16 model! It was very hard to get out: I had to use transformers only and NOT Unsloth.
To train with transformers only, the model needs to be on an A100, as it takes a very large amount of memory (perhaps because it is a special model).
I did manage to load and train the model with Unsloth, but the LoRA did not merge into the FP16 model.
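For readers who want to reproduce the transformers-only route, the merge step could look roughly like the sketch below. It is not necessarily how this model was produced: it assumes the adapter was trained with the `peft` library, and `base_model_id`, `adapter_dir` and `output_dir` are placeholder names. The `trust_remote_code` flag is only relevant if the base repository ships its own modelling code, which is an assumption here.

```python
def merge_lora_to_fp16(base_model_id, adapter_dir, output_dir, trust_remote_code=False):
    """Fold a trained LoRA adapter into an FP16 base model and save the result.

    All three path/id arguments are placeholders chosen for this sketch.
    """
    # Heavy libraries are imported lazily so that defining this function
    # does not require torch, peft or transformers to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base weights in half precision (not 4-bit), so the merge
    # writes real FP16 tensors instead of quantized ones.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.float16,
        trust_remote_code=trust_remote_code,
    )

    # Attach the adapter, then fold the low-rank updates into the base weights.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # Save the merged model together with the tokenizer of the base model.
    merged.save_pretrained(output_dir, safe_serialization=True)
    tokenizer = AutoTokenizer.from_pretrained(
        base_model_id, trust_remote_code=trust_remote_code
    )
    tokenizer.save_pretrained(output_dir)
    return output_dir
```

Loading an FP16 model this way keeps the full weights in memory, which fits the note above about needing a large GPU.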
### REASON :
Unsloth issue: if you are loading a 16-bit model, Unsloth loads its own model
and trains the LoRA, expecting you to merge it afterwards. But if you use a 4-bit model, Unsloth loads your exact model!?
You think... WHAT? They download your own tensors but use another model??
Yes: they have a Mistral modelling file of their own, which is much simpler and more lightweight than the transformers one, so your customizations do not get loaded.
So this file will have to be adjusted for me to fully train each head intensively and test the outcomes correctly... but it is working fine!


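One rough way to check whether customizations were dropped at load time is to compare the tensor names stored in the checkpoint with the parameter names of the loaded model. The sketch below only works on lists of names; the tensor names in it are hypothetical and stand in for whatever the real model uses.

```python
def find_unloaded_tensors(checkpoint_keys, model_keys):
    """Return checkpoint tensor names that the loaded model has no slot for."""
    return sorted(set(checkpoint_keys) - set(model_keys))


# Hypothetical names for illustration only; real names depend on the model.
checkpoint_keys = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.extra_head.weight",  # stands in for a customization
    "lm_head.weight",
]
model_keys = [
    "model.layers.0.self_attn.q_proj.weight",
    "lm_head.weight",
]

unloaded = find_unloaded_tensors(checkpoint_keys, model_keys)
print(unloaded)
```

In practice, `model_keys` could come from `model.state_dict().keys()`, and for a sharded safetensors checkpoint `checkpoint_keys` could be read from the `weight_map` entry of `model.safetensors.index.json`. Any name left in the result points at a tensor that the modelling code in use did not pick up.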
## SpydazWeb AI model :

This model is based on the world's archive of knowledge, maintaining historical documents and providing services for the survivors of mankind,