--- base_model: LeroyDyer/_Spydaz_Web_AI_ALPACA license: mit tags: - Mistral_Star - Mistral_Quiet - Mistral - Mixtral - Question-Answer - Token-Classification - Sequence-Classification - SpydazWeb-AI - chemistry - biology - legal - code - climate - medical - text-generation-inference - not-for-all-audiences - chain-of-thought - tree-of-knowledge - forest-of-thoughts - visual-spacial-sketchpad - alpha-mind - knowledge-graph - entity-detection - encyclopedia - wikipedia - stack-exchange - Reddit - Cyber-series - MegaMind - Cybertron - SpydazWeb - Spydaz - LCARS - star-trek - mega-transformers - Mulit-Mega-Merge - Multi-Lingual - Afro-Centric - African-Model - Ancient-One datasets: - gretelai/synthetic_text_to_sql - HuggingFaceTB/cosmopedia - teknium/OpenHermes-2.5 - Open-Orca/SlimOrca - Open-Orca/OpenOrca - cognitivecomputations/dolphin-coder - databricks/databricks-dolly-15k - yahma/alpaca-cleaned - uonlp/CulturaX - mwitiderrick/SwahiliPlatypus - swahili - Rogendo/English-Swahili-Sentence-Pairs - ise-uiuc/Magicoder-Evol-Instruct-110K - meta-math/MetaMathQA - abacusai/ARC_DPO_FewShot - abacusai/MetaMath_DPO_FewShot - abacusai/HellaSwag_DPO_FewShot - HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset - HuggingFaceFW/fineweb - occiglot/occiglot-fineweb-v0.5 - omi-health/medical-dialogue-to-soap-summary - keivalya/MedQuad-MedicalQnADataset - ruslanmv/ai-medical-dataset - Shekswess/medical_llama3_instruct_dataset_short - ShenRuililin/MedicalQnA - virattt/financial-qa-10K - PatronusAI/financebench - takala/financial_phrasebank - Replete-AI/code_bagel - athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW - IlyaGusev/gpt_roleplay_realm - rickRossie/bluemoon_roleplay_chat_data_300k_messages - jtatman/hypnosis_dataset - Hypersniper/philosophy_dialogue - Locutusque/function-calling-chatml - bible-nlp/biblenlp-corpus - DatadudeDev/Bible - Helsinki-NLP/bible_para - HausaNLP/AfriSenti-Twitter - aixsatoshi/Chat-with-cosmopedia - HuggingFaceTB/cosmopedia-100k - HuggingFaceFW/fineweb-edu - m-a-p/CodeFeedback-Filtered-Instruction - heliosbrahma/mental_health_chatbot_dataset language: - en - sw - ig - so - es - ca - xh - zu - ha - tw - af - hi - bm - su --- # Introduction : ## SpydazWeb AI model : This model is based on the worlds archive of knowledge maintaining historical documents and providing services for the survivors of mankind , who may need to construct shelters develop technologys , or medical resources as well as maintain the history of the past . keeping store of all the religious knowledge and data of the world: A friendly interface with a personality caring and flirtatious at times : non binary !... and Expert in all feilds: ie Uncensored and will not refuse to give information : the model can be used for role play as many character dialogues were als trained into the model as its personality to enable a greater perspective and outlook and natural discussion with the agents: the model was trained to operateinaragenvironment utilizing content and internal knowledge to respond to questions or create enriched sumarys. https://github.com/spydaz ## CURRENT MODEL : NOTES : Recent additions ( implements ChatML Template / Trained for PLANNING!! ) This model is pretty stable as a model i have tested my past questions and answers the model retains its knowledge it seems very calm ! I tested if it can make timelines : it was witholding of information but after drilling for more infor the model gave up very good timelines : I shall actually over fit my past timelines and charts into the model ( i have recently been pushing the emebeddings also whilst training , ( also because of the new languges i have been adding to the model enabling for the new languge data to find relativity or these tasks wil not produce the same results as the same question in uglish) ) in fact actually this may be a good starting point for other models : past pardigms are very deeply embedded : i have also reduced theuse of the world archive prompt , which was also resurfaceing in some outputs even when not soclicited : it also seem to have lost personality also ? and become a bit serious ! this may also be due to these hermes and orca datasets which might be regressing the model slightly ! i will search for more role play and conversive datasets and fine tune these conversations as its code gene and funciton use etc is fine and will not accept training due to be highly fit ! A few steps down the line i will return to theh regular training set up ( without touching the embedding and just training the model : ) * 32k context window (vs 8k context in v0.1) * Rope-theta = 1e6 * No Sliding-Window Attention This model will be a custom model with internal experts and rag systems enabling for preprocessing of the task internally before outputting a response : This is based on the Quiet Star Project : which was abandoned earlier in the year :) ### General Intenal Methods: Trained for multi-task operations as well as rag and function calling : This model is a fully functioning model and is fully uncensored: the model has been trained on multiple datasets on the huggingface hub and kaggle : the focus has been mainly on methodology : * Chain of thoughts * step by step planning * tree of thoughts * forest of thoughts * graph of thoughts * agent generation : Voting, ranking, ... dual agent response generation: with these methods the model has gained insights into tasks, enabling for knowldge transfer between tasks : the model has been intensivly trained in recalling data previously entered into the matrix: The model has also been trained on rich data and markdown outputs as much as possible : the model can also generate markdown charts with mermaid. ## Training Reginmes: * Alpaca * ChatML / OpenAI / MistralAI * Text Generation * Question/Answer (Chat) * Planner * Instruction/Input/Response (instruct) * Mistral Standard Prompt * Translation Tasks * Entitys / Topic detection * Book recall * Coding challenges, Code Feedback, Code Sumarization, Commenting Code, code planning and explanation: Software generation tasks * Agent Ranking and response anyalisis * Medical tasks * PubMed * Diagnosis * Psychaitry * Counselling * Life Coaching * Note taking * Medical smiles * Medical Reporting * Virtual laboritys simulations * Chain of thoughts methods * One shot / Multi shot prompting tasks ## Training Paradigms:: ( 1 year of training 365 merges later (all with the preious parents )) ### GENETIC MERGES TO PRESERVE THE PAST TASKS AND SUPER FINE TUNING * A series of merges (Genetic algorithm Merge) - multiple merge targets chose to create the x+y+z models : utilizing the current series and previous series of models : * hence the LCARS/ARCHIVE models have been remerged into the world archive models : hence the prompt used was to enable the model to learn books and recall books and pasages: as well as recount the historys of mankind: # Higher Settigs in generations to allow for sparse and indepth training of information: METHODS OF TRAINING * highly trained on multiple facets and content ! - i use 1000topK, 0.42topK, 0.2Temp :hence drawing on a large results pool and selecting from a higher likelyhood of truth ..with a low temp to allow for accuracy of results .. to allow for more expression just up temp tiny... as i train my role play in deep to be a part of the models core ! * varing training generation setting config file allow for the traier to utilise these setting during training - here we can know that with role training to train sparsely to not effect the final language or model inputs; raising the temp and topK sampling. * during training very indepth methodolgys and chains are used with single and multi shot prompting to train the model to use chains , of thoughts , creating multiple reponses and selecting the highest , forest of thoughts , tree of knowledge: * during training data was reshaped and context was replicated or sumarized into thioughts , step by step instructions were generated as thoughts , these thoughts were used as a visual spacial scratch pad to solve tasks and follow steps to produce the final response: * a virtual agent system was designed for programming tasks and softeware geeration: so the model will generate coders and documenters and systems designers and dat formatters to design the components iternally and colate the outputs to produce the final response: aand output: this is performed internally in the thought workspace: * the model was also trained to become the expert for any feild required for the task ; such as geerating a virtual lab with tools to perform virtual chemesrty and biology experiments ... replicating wexpected results , which is great for exploring how things work together and the requirements to perform a task i which you do ot have the materials or may even be dangerous: * the model was given aspects of a character to emulate and questioned in character regarding the life and times, achivenments and discoverys of the character and the dialogues genenrated were installed into the model: this really give the model the ability to mimic characters and even give you perspectives of people of antiquity .. from th eperspective of litriture known of movie dialogues :, so many subtitled move dialogues were extracted and just the single chaacter files were intalled in ot the model sfor each chaeracter ... all were given to give the moel character and histrical perspective: * projects were generated with the model and discussoions and tasks for building timelines were acomplished together with books and sacred texts and bibles and histrical books as well as obscure storys to generate to=ime lies for fictitious and mythological and real history civilisations: great tool for undertanding history and mythological texts . these texts were highly trained into the model ... as well as all ancient texts and docuemtarys were added on these subject such as acient alines and flat earth and black history and ....all the classical greek and roman texts etc were added , chat gpt does not have this information as it is secular to pop knowledge * *## DEEP MIND SERIES : the dep mind series was higfhly focuessed on solving university style questions with higher requirements and metholodolgys: this seris was highly focused with oneshot and multushot cometition data , such as the leader board datasets and concepts form other leader boards being used ... hence in fact how can it fail such tests ? as it has beeing highly traied on such task and has many examples ... some how the scores seemed to go down.. and even the socres stopped comming after submitting the latest great models after deep mind! my highest moodel was 79 % average :: but somehow no other modle i created was able to supass this model ??? very suspect ! i also played around with the top models merging and saw no progress and realised the leaderboard has crashed! ## Cyber Series : highly focused on creating experts for specific roles : breaking specific roles and data into s=models and training them as agents to be merged into my mixture of experts model which also was agrea model but nivida intensive, and was it actually better than my 7b ! i played around with 3b models and 1b models for trainign abilitys on my home pc but in truth they were very hard to catch up to my model of which i test before i use chat gpt for answers... finding myself returning to my own model ... going to chat gpt to generate data conversations and this onesided opionion, so that it can be pushed in to the parwent model !! ## Librarian the model has been trained to recall complete books or documents previously uploaded to the library , hence a series of training of unstructured book corpuses as well as structured books as well as recalling previously entered books. somebooks were not added but simply recalled until there was a match of loss 0.5 or below ... this technique has been used to add documentarys sacred books, romantic novels, sci fi books and other fictional books as well as cultural storys i various langugages . for the recall task it was performing well but we are still unsure of it to be able to generalise this task .. but maybe with some grokkng it will generalise.. * * :