LeroyDyer/LCARS_AI_QstaR_Nemo_GGUF

LeroyDyer committed on Jul 21, 2024

Commit

20dc4fd

verified ·

1 Parent(s): 81bf2f5

Update README.md

Files changed (1)

README.md +108 -9

@@ -1,22 +1,121 @@
 ---
 base_model: unsloth/mistral-nemo-instruct-2407-bnb-4bit
-language:
-- en
 license: apache-2.0
 tags:
 - text-generation-inference
-- transformers
-- unsloth
-- mistral
-- gguf
 ---
-# Uploaded  model
 - **Developed by:** LeroyDyer
 - **License:** apache-2.0
 - **Finetuned from model :** unsloth/mistral-nemo-instruct-2407-bnb-4bit
-This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
 base_model: unsloth/mistral-nemo-instruct-2407-bnb-4bit
 license: apache-2.0
 tags:
+- Mistral_Star
+- Mistral_Quiet
+- Mistral
+- Mixtral
+- Question-Answer
+- Token-Classification
+- Sequence-Classification
+- SpydazWeb-AI
+- chemistry
+- biology
+- legal
+- code
+- climate
+- medical
 - text-generation-inference
+language:
+- en
+- sw
+- ig
+- zu
+- ca
+- es
+- pt
+- ha
 ---
+# Spydaz WEB AI
+## Model Architecture
+Mistral Nemo is a transformer model, with the following architecture choices:
+- **Layers:** 40
+- **Dim:** 5,120
+- **Head dim:** 128
+- **Hidden dim:** 14,336
+- **Activation Function:** SwiGLU
+- **Number of heads:** 32
+- **Number of kv-heads:** 8 (GQA)
+- **Vocabulary size:** 2**17 ~= 128k
+- **Rotary embeddings (theta = 1M)**
 - **Developed by:** LeroyDyer
 - **License:** apache-2.0
 - **Finetuned from model :** unsloth/mistral-nemo-instruct-2407-bnb-4bit
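As a quick sanity check on the architecture figures listed above, the following sketch derives a few quantities from them: the vocabulary size implied by `2**17`, the number of query heads sharing each key/value head under GQA, and the widths of the query and key/value projections. The variable names are illustrative and are not taken from the model's config file.

```python
# Values copied from the architecture list above; the names are illustrative.
dim = 5120
head_dim = 128
n_heads = 32
n_kv_heads = 8

vocab_size = 2 ** 17                       # "2**17 ~= 128k"
queries_per_kv_head = n_heads // n_kv_heads  # GQA group size
q_proj_width = n_heads * head_dim          # total width of all query heads
kv_proj_width = n_kv_heads * head_dim      # total width of all key (or value) heads

print(vocab_size)           # 131072
print(queries_per_kv_head)  # 4
print(q_proj_width)         # 4096
print(kv_proj_width)        # 1024
```

Note that the query projection width (4096) is smaller than the model dimension (5120), because the head dimension is listed explicitly as 128 rather than being `dim / n_heads`.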
+<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
+https://github.com/spydaz
+# Introduction
+## STAR REASONERS!
+This provides a platform for the model to communicate pre-response, so that an internal objective can be set, i.e. adding an extra planning stage that improves the model's focus and output.
+A thought head can be charged with a thought or methodology, such as a setting to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.
+Each thought head can therefore be dedicated to a specific purpose, such as planning, artifact generation, or use-case design, or even to deciding which methodology should be applied before planning the potential solution route for the response.
+Another head could be dedicated to retrieving content from the model itself based on the query, which can also be used in the pre-generation stages.
+All pre-reasoners can be seen as self-guiding, essentially removing the requirement to give the model a system prompt and instead aligning the heads to thought pathways.
+These chains produce data which can be considered thoughts, and which can be displayed by framing them with thought tokens, even allowing for editor's comments that give key guidance to the model during training.
+These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.
+Whether these tokens are displayed or withheld is also a setting in the model.
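A minimal sketch of the display-or-withhold setting described above, applied to already-decoded text. The marker strings `<|startthought|>` and `<|endthought|>`, the `render_output` function, and the `show_thoughts` flag are assumptions for illustration; this card does not specify the actual thought tokens or the name of the setting.

```python
import re

# Hypothetical thought markers; the real tokens are not given in this card.
START = "<|startthought|>"
END = "<|endthought|>"
_THOUGHT_SPAN = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)

def render_output(text: str, show_thoughts: bool = False) -> str:
    """Return decoded text with framed thought spans kept or withheld."""
    if show_thoughts:
        return text
    # Drop every framed thought, then tidy leftover whitespace.
    stripped = _THOUGHT_SPAN.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", stripped).strip()

raw = "<|startthought|>Plan: work step by step.<|endthought|> The answer is 4."
print(render_output(raw))                      # thoughts withheld
print(render_output(raw, show_thoughts=True))  # thoughts displayed
```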
+### Can this be applied in other areas?
+Yes! This type of method can allow the model to generate code in another channel or head, potentially creating a head that produces artifacts for every output, or one that produces entity lists for every output, framing the outputs in their respective code tags or function-call tags.
+These can also be displayed or hidden in the response, but they can also be used internally in problem-solving tasks, which again enables the model to simulate the inputs and outputs of an interpreter.
+It may even be prudent to include function execution internal to the model (allowing the model to execute functions in the background before responding). This would also have to be specified in the config, as autoexecute or not.
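To make the "autoexecute or not" idea concrete, here is a small sketch of a config flag that gates whether a function call emitted by the model is run in the background or merely reported. Everything here is hypothetical: `ReasonerConfig`, `auto_execute`, and `handle_function_call` are illustrative names, not part of this model's configuration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class ReasonerConfig:
    # Hypothetical flag mirroring the "autoexecute or not" setting.
    auto_execute: bool = False

def handle_function_call(
    name: str,
    args: Dict[str, Any],
    registry: Dict[str, Callable[..., Any]],
    config: ReasonerConfig,
) -> Dict[str, Any]:
    """Run a registered function only when auto-execution is enabled."""
    if name not in registry:
        raise KeyError(f"unknown function: {name}")
    if not config.auto_execute:
        # Leave the call for the calling interface to approve or run.
        return {"status": "pending", "name": name, "args": args}
    return {"status": "executed", "name": name, "result": registry[name](**args)}

registry = {"add": lambda a, b: a + b}
print(handle_function_call("add", {"a": 2, "b": 3}, registry, ReasonerConfig()))
print(handle_function_call("add", {"a": 2, "b": 3}, registry, ReasonerConfig(auto_execute=True)))
```

Restricting execution to an explicit registry is one way to apply the constraints mentioned later in this card.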
+#### AI AGI?
+So yes, we can see we are not far from an AI which can evolve: an advanced general intelligence system (still non-sentient, by the way).
+### Conclusion
+The reasoner methodology might be seen as the way forward: adding internal functionality to the models instead of external connectivity enables faster and more seamless model usage, as well as enriched and informed responses, since outputs could be cleansed and formatted internally to the model before being presented to the calling interface.
+The takeaway is a question: are we seeing the decoder/encoder model as simply a function of the intelligence, which in truth needs to be autonomous?
+That is, internal functions and tools as well as disk interaction: an agent must have awareness of and control over its environment, with sensors and actuators. As a function-calling model it has actuators, and if it can read the directories it has sensors... it's a start, as we can get media in and out, but the model also needs its own control over input and output.
+Fine tuning: again this issue of fine tuning. The discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine tune a model?
+In fact it should, as this gives transparency to the growth of the model, and if the model fine-tuned itself we would be in danger of a model evolving!
+Hence an AGI!
+# LOAD MODEL
+```
+! git clone https://github.com/huggingface/transformers.git
+## Copy modeling_mistral.py and configuration.py into transformers/src/transformers/models/mistral, overwriting the existing files first.
+## THEN install from the local clone (each `!` command runs in its own shell, so a separate `!cd` has no effect and is not needed):
+!pip install ./transformers
+```
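The copy step described in the comments above could be scripted along these lines. This is a sketch under assumptions: `install_custom_files` is a hypothetical helper, the destination follows the layout of the transformers repository (`src/transformers/models/mistral`), and the file names passed in must match whatever the model repository actually ships.

```python
import shutil
from pathlib import Path

def install_custom_files(source_dir, repo_dir, filenames):
    """Copy custom model files over the Mistral files in a transformers clone.

    Hypothetical helper: `source_dir` holds the downloaded files and
    `repo_dir` is the root of the cloned transformers repository.
    """
    target = Path(repo_dir) / "src" / "transformers" / "models" / "mistral"
    if not target.is_dir():
        raise FileNotFoundError(f"expected a transformers clone at {repo_dir}")
    copied = []
    for name in filenames:
        src = Path(source_dir) / name
        # shutil.copy2 overwrites an existing file at the destination.
        copied.append(Path(shutil.copy2(src, target / name)))
    return copied
```

For example, `install_custom_files(".", "transformers", ["modeling_mistral.py"])` would be run after the clone and before `pip install ./transformers`.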
+Then restart the environment: the model can then load without trust-remote and will work fine!
+It can even be trained, hence the 4-bit optimised version:
+``` Python
+# Load model directly
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_id = "LeroyDyer/_Spydaz_Web_AI_MistralStar_V2"
+# trust_remote_code=True lets transformers run custom model code from the repo, if it ships any.
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
+# Attach the tokenizer to the model object as an attribute.
+model.tokenizer = tokenizer
+```