File size: 9,459 Bytes
02c6cbf
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a4aa9e7
 
b5ca2e7
 
 
 
 
 
 
 
 
 
 
 
f1b1e3a
02c6cbf
f1b1e3a
 
 
 
 
02c6cbf
 
3afb466
 
 
 
 
f1b1e3a
 
02c6cbf
 
f1b1e3a
02c6cbf
f1b1e3a
02c6cbf
f1b1e3a
02c6cbf
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
f1b1e3a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI 
( trained with heads ) : updated Codebase to Mistral-Nemo CodeBase : ( Perhaps i will make the nemo today !)

## Training Note  : 
This is the base FP16 model !  ( very hard to get out! i had to use transformers only and NOT unsloth !):
to only train with transformers ( the model needs to be on the a100 as it takes a super amount of memory ?? as it is a special model)
I did manage to load and train the model with unsloth but the model did not Merge the lora ! to fp16 :
### REASON :
Unsloth issues : they load thier own model if your loading a 16bit model ... 
and train the lora expecting you to merge it.. but if you use a 4bit model then the unsloth loads your exaact model !?
you think.... WHAT? they download your own tensors but use another model ?? 
yes they have a mistral modelling file of thier own which is much more simple than transformers : more lighter weight ... so your customizations do not get loaded ? 
SO this file will have to be adjusted for me to full train each head intensly and test the outcomes correctly ... but its working fine !


## SpydazWeb AI model :

This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind ,
who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world:
A friendly interface with a personality caring and flirtatious at times : non binary !...
and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents:
the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys.


### After unsloth Training Warning
This model cannot be saved as 16bit curretly  - SOON ! 
but it can be saved as 4bitMerged


<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
https://github.com/spydaz


### General Intenal Methods:

Trained for multi-task operations as well as rag and function calling :

This model is a fully functioning model and is fully uncensored: 

    * 32k context window (vs 8k context in v0.1)
    * Rope-theta = 1e6
    * No Sliding-Window Attention
    * Talk heads  - produce resposnes which can be used towards the final output
    * Pre-Thoughts  - Enables for pre-generation steps of potential artifacts for task solving: 
      * Generates plans for step by step thinking 
      * Generates python Code Artifacts for future tasks
      * Recalls context for task internally to be used as refference for task:
    * show thoughts or hidden thought usages ( Simular to self-Rag )

the model has been trained on multiple datasets on the huggingface hub and kaggle :

the focus has been mainly on methodology : 

* Chain of thoughts
* step by step planning
* tree of thoughts
* forest of thoughts
* graph of thoughts
* agent generation : Voting, ranking, ... dual agent response generation:

with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks :

the model has been intensivly trained in recalling data previously entered into the matrix:
The model has also been trained on rich data and markdown outputs as much as possible : 
the model can also generate markdown charts with mermaid.


## Training Reginmes:
  * Alpaca
  * ChatML / OpenAI / MistralAI
  * Text Generation
  * Question/Answer (Chat)
  * Instruction/Input/Response (instruct)
  * Mistral Standard Prompt
  * Translation Tasks
  * Entitys / Topic detection
  * Book recall
  * Coding challenges, Code Feedback, Code Sumarization, Commenting Code
  * Agent Ranking and response anyalisis
  * Medical tasks
    * PubMed
    * Diagnosis
    * Psychaitry
    * Counselling
    * Life Coaching
    * Note taking
    * Medical smiles
    * Medical Reporting
  * Virtual laboritys simulations
  * Chain of thoughts methods
  * One shot / Multi shot prompting tasks
    
This model will be a custom model with internal experts and rag systems
enabling for preprocessing of the task internally before outputting a response 

This is based on the Quiet Star Reasoning Project : which was abandoned earlier in the year :)

Current Update :
This model is working , AND TRAINED !!!  to load the model it requires trust-remote=TRUE:: 
But also if it does not load then you need to clone the github:

# Introduction :

## STAR REASONERS ! 

this provides a platform for the model to commuicate pre-response , so an internal objective can be set ie adding an extra planning stage to the model improving its focus and output:
the thought head can be charged with a thought or methodolgy, such as a ststing to take a step by step approach to the problem or to make an object oriented model first and consider the use cases before creating an output:
so each thought head can be dedicated to specific ppurpose such as Planning or artifact generation or use case design : or even deciding which methodology should be applied before planning the potential solve route for the response :
Another head could also be dedicated to retrieving content  based on the query from the self which can also be used in the pregenerations stages :
all pre- reasoners can be seen to be Self Guiding ! essentially removing the requirement to give the model a system prompt instead aligning the heads to a thoght pathways ! 
these chains produce data which can be considered to be thoughts : and  can further be displayed by framing these thoughts with thought tokens : even allowing for editors comments giving key guidance to the model during training : 
these thoughts will be used in future genrations assisting the model as well a displaying explantory informations in the output :

these tokens can be displayed or with held also a setting in the model ! 
### can this be applied in other areas ?

Yes! , we can use this type of method to allow for the model to generate code in another channel or head potentially creating a head to produce artifacts for every output , or to produce entity lilsts for every output and framing the outputs in thier relative code tags or function call tags :
these can also be displayed or hidden for the response . but these can also be used in problem solvibng tasks internally , which again enables for the model to simualte the inpouts and outputs from an interpretor ! 
it may even be prudent to include a function executing internally to the model ! ( allowing the model to execute functions in the background! before responding ) as well this oul hae tpo also be specified in the config , as autoexecute or not !. 

#### AI AGI ?
so yes we can see we are not far from an ai which can evolve : an advance general inteligent system ( still non sentient by the way ) 


### Conclusion 

the resonaer methodology , might be seen to be the way forwards , adding internal funciton laity to the models instead of external connectivity enables for faster and seemless model usage : as well as enriched and informed responses , as even outputs could essentially be cleanss and formated before being presented to the Calling interface, internally to the model :
the  take away is that arre we seeing the decoder/encoder model as simple a function of the inteligence which  in truth need to be autonomus ! 
ie internal functions and tools as well as disk interaction : an agent must have awareness and control over its environment with sensors and actuators : as a fuction callingmodel it has actuators and canread the directorys it has sensors ... its a start: as we can eget media in and out , but the model needs to get its own control to inpout and output also ! 

Fine tuning :  agin this issue of fine tuning : the disussion above eplains the requirement to control the environment from within the moel ( with constraints ) does this eliminate theneed to fine tune a model ! 
in fact it should as this give  transparency to ther growth ofthe model and if the model fine tuned itself we would be in danger of a model evolveing !
hence an AGI ! 

# LOAD MODEL

```
! git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuartion.py to the Transformers foler / Src/models/mistral and overwrite the existing files first: 
## THEN :
!cd transformers
!pip install  ./transformers

```

then restaet the environment: the model can then load without trust-remote and WILL work FINE ! 
it can even be trained : hence the 4 bit optimised version ::

``` Python


# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model.tokenizer = tokenizer

```