LeroyDyer's picture
Update README.md
b5ca2e7 verified
metadata
license: mit
tags:
  - Mistral_Star
  - Mistral_Quiet
  - Mistral
  - Mixtral
  - Question-Answer
  - Token-Classification
  - Sequence-Classification
  - SpydazWeb-AI
  - chemistry
  - biology
  - legal
  - code
  - climate
  - medical
  - text-generation-inference
language:
  - en
  - sw
  - ig
  - zu
  - ca
  - es
  - pt
  - ha
pipeline_tag: text-generation

SpydazWeb AGI

( trained with heads ) : updated Codebase to Mistral-Nemo CodeBase : ( Perhaps i will make the nemo today !)

Training Note :

This is the base FP16 model ! ( very hard to get out! i had to use transformers only and NOT unsloth !): to only train with transformers ( the model needs to be on the a100 as it takes a super amount of memory ?? as it is a special model) I did manage to load and train the model with unsloth but the model did not Merge the lora ! to fp16 :

REASON :

Unsloth issues : they load thier own model if your loading a 16bit model ... and train the lora expecting you to merge it.. but if you use a 4bit model then the unsloth loads your exaact model !? you think.... WHAT? they download your own tensors but use another model ?? yes they have a mistral modelling file of thier own which is much more simple than transformers : more lighter weight ... so your customizations do not get loaded ? SO this file will have to be adjusted for me to full train each head intensly and test the outcomes correctly ... but its working fine !

SpydazWeb AI model :

This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind , who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world: A friendly interface with a personality caring and flirtatious at times : non binary !... and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents: the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys.

After unsloth Training Warning

This model cannot be saved as 16bit curretly - SOON ! but it can be saved as 4bitMerged

https://github.com/spydaz

General Intenal Methods:

Trained for multi-task operations as well as rag and function calling :

This model is a fully functioning model and is fully uncensored:

* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No Sliding-Window Attention
* Talk heads  - produce resposnes which can be used towards the final output
* Pre-Thoughts  - Enables for pre-generation steps of potential artifacts for task solving: 
  * Generates plans for step by step thinking 
  * Generates python Code Artifacts for future tasks
  * Recalls context for task internally to be used as refference for task:
* show thoughts or hidden thought usages ( Simular to self-Rag )

the model has been trained on multiple datasets on the huggingface hub and kaggle :

the focus has been mainly on methodology :

  • Chain of thoughts
  • step by step planning
  • tree of thoughts
  • forest of thoughts
  • graph of thoughts
  • agent generation : Voting, ranking, ... dual agent response generation:

with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks :

the model has been intensivly trained in recalling data previously entered into the matrix: The model has also been trained on rich data and markdown outputs as much as possible : the model can also generate markdown charts with mermaid.

Training Reginmes:

  • Alpaca
  • ChatML / OpenAI / MistralAI
  • Text Generation
  • Question/Answer (Chat)
  • Instruction/Input/Response (instruct)
  • Mistral Standard Prompt
  • Translation Tasks
  • Entitys / Topic detection
  • Book recall
  • Coding challenges, Code Feedback, Code Sumarization, Commenting Code
  • Agent Ranking and response anyalisis
  • Medical tasks
    • PubMed
    • Diagnosis
    • Psychaitry
    • Counselling
    • Life Coaching
    • Note taking
    • Medical smiles
    • Medical Reporting
  • Virtual laboritys simulations
  • Chain of thoughts methods
  • One shot / Multi shot prompting tasks

This model will be a custom model with internal experts and rag systems enabling for preprocessing of the task internally before outputting a response

This is based on the Quiet Star Reasoning Project : which was abandoned earlier in the year :)

Current Update : This model is working , AND TRAINED !!! to load the model it requires trust-remote=TRUE:: But also if it does not load then you need to clone the github:

Introduction :

STAR REASONERS !

this provides a platform for the model to commuicate pre-response , so an internal objective can be set ie adding an extra planning stage to the model improving its focus and output: the thought head can be charged with a thought or methodolgy, such as a ststing to take a step by step approach to the problem or to make an object oriented model first and consider the use cases before creating an output: so each thought head can be dedicated to specific ppurpose such as Planning or artifact generation or use case design : or even deciding which methodology should be applied before planning the potential solve route for the response : Another head could also be dedicated to retrieving content based on the query from the self which can also be used in the pregenerations stages : all pre- reasoners can be seen to be Self Guiding ! essentially removing the requirement to give the model a system prompt instead aligning the heads to a thoght pathways ! these chains produce data which can be considered to be thoughts : and can further be displayed by framing these thoughts with thought tokens : even allowing for editors comments giving key guidance to the model during training : these thoughts will be used in future genrations assisting the model as well a displaying explantory informations in the output :

these tokens can be displayed or with held also a setting in the model !

can this be applied in other areas ?

Yes! , we can use this type of method to allow for the model to generate code in another channel or head potentially creating a head to produce artifacts for every output , or to produce entity lilsts for every output and framing the outputs in thier relative code tags or function call tags : these can also be displayed or hidden for the response . but these can also be used in problem solvibng tasks internally , which again enables for the model to simualte the inpouts and outputs from an interpretor ! it may even be prudent to include a function executing internally to the model ! ( allowing the model to execute functions in the background! before responding ) as well this oul hae tpo also be specified in the config , as autoexecute or not !.

AI AGI ?

so yes we can see we are not far from an ai which can evolve : an advance general inteligent system ( still non sentient by the way )

Conclusion

the resonaer methodology , might be seen to be the way forwards , adding internal funciton laity to the models instead of external connectivity enables for faster and seemless model usage : as well as enriched and informed responses , as even outputs could essentially be cleanss and formated before being presented to the Calling interface, internally to the model : the take away is that arre we seeing the decoder/encoder model as simple a function of the inteligence which in truth need to be autonomus ! ie internal functions and tools as well as disk interaction : an agent must have awareness and control over its environment with sensors and actuators : as a fuction callingmodel it has actuators and canread the directorys it has sensors ... its a start: as we can eget media in and out , but the model needs to get its own control to inpout and output also !

Fine tuning : agin this issue of fine tuning : the disussion above eplains the requirement to control the environment from within the moel ( with constraints ) does this eliminate theneed to fine tune a model ! in fact it should as this give transparency to ther growth ofthe model and if the model fine tuned itself we would be in danger of a model evolveing ! hence an AGI !

LOAD MODEL

! git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuartion.py to the Transformers foler / Src/models/mistral and overwrite the existing files first: 
## THEN :
!cd transformers
!pip install  ./transformers

then restaet the environment: the model can then load without trust-remote and WILL work FINE ! it can even be trained : hence the 4 bit optimised version ::



# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model.tokenizer = tokenizer