Ontocord.AI committed bffbe64 (1 parent: 9f3edac): Update README.md

README.md CHANGED
@@ -1,7 +1,24 @@
 ---
 license: apache-2.0
+tags:
+- MDEL
 ---
 
+# Model Name
+Multi-Domain-Expert-Layers/MDEL-theblackcat-chat-5-experts
+
+# Model Description
+This model was generated by averaging the weights of the following models:
+- [Multi-Domain-Expert-Layers/expert-pubmed_central](https://huggingface.co/Multi-Domain-Expert-Layers/expert-pubmed_central)
+- [Multi-Domain-Expert-Layers/expert-freelaw](https://huggingface.co/Multi-Domain-Expert-Layers/expert-freelaw)
+- [Multi-Domain-Expert-Layers/expert-github](https://huggingface.co/Multi-Domain-Expert-Layers/expert-github)
+- [Multi-Domain-Expert-Layers/expert-uspto](https://huggingface.co/Multi-Domain-Expert-Layers/expert-uspto)
+- [Multi-Domain-Expert-Layers/expert-arxiv](https://huggingface.co/Multi-Domain-Expert-Layers/expert-arxiv)
+- [theblackcat102/pythia-1b-deduped-sft](https://huggingface.co/theblackcat102/pythia-1b-deduped-sft)
+- We also keep a mixture that is primarily one of the above as an expert that can be loaded on demand.
+
+### NOTE: There is a mistake below: we use a routed expert for pubmed-abstracts, but we merged pubmed_central.
+
 ```
 # test merged experts
 # TODO: add dynamic routing, testing better expert mixtures
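For readers who want the averaging step concretely: below is a minimal sketch of equal-weight parameter averaging over the six checkpoints listed in this hunk. It assumes all checkpoints share the pythia-1b architecture; the code is illustrative, not the repository's actual merging script (which appears further below as `recreate_merged_expert()`).

```python
import torch
from transformers import AutoModelForCausalLM

# The six checkpoints named in the model description above.
EXPERTS = [
    "Multi-Domain-Expert-Layers/expert-pubmed_central",
    "Multi-Domain-Expert-Layers/expert-freelaw",
    "Multi-Domain-Expert-Layers/expert-github",
    "Multi-Domain-Expert-Layers/expert-uspto",
    "Multi-Domain-Expert-Layers/expert-arxiv",
    "theblackcat102/pythia-1b-deduped-sft",
]

def average_checkpoints(model_ids):
    """Equal-weight, elementwise average of the checkpoints' float parameters."""
    base = AutoModelForCausalLM.from_pretrained(model_ids[0], torch_dtype=torch.float32)
    acc = base.state_dict()
    for model_id in model_ids[1:]:
        other = AutoModelForCausalLM.from_pretrained(
            model_id, torch_dtype=torch.float32
        ).state_dict()
        for name, tensor in acc.items():
            if tensor.dtype.is_floating_point:  # skip integer/bool buffers
                tensor.add_(other[name])
    for tensor in acc.values():
        if tensor.dtype.is_floating_point:
            tensor.div_(len(model_ids))
    base.load_state_dict(acc)
    return base

merged = average_checkpoints(EXPERTS)
merged.save_pretrained("MDEL-theblackcat-chat-5-experts")
```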
@@ -86,6 +103,8 @@ for expert in ["orig_chat", "merged_chat_expert", "uspto_expert", "github_expert
 print (model1.generate_with_expert("Field of the Invention.\nAn electric toothbrush\n", tokenizer, expert=expert)[0])
 ```
 
+### To recreate the expert, modify this script. We can also extend it to do dynamic merging and/or experiment with different weights for different layers.
+
 ```
 
 def recreate_merged_expert():
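The body of `recreate_merged_expert()` is truncated in this view. As a hedged sketch of the extension the heading suggests (different weights for different layers), one could interpolate base and expert per layer; the weight schedule, the helper name, and the GPT-NeoX `gpt_neox.layers.N.` naming assumption are illustrative, not the repository's actual code.

```python
import re
import torch
from transformers import AutoModelForCausalLM

def merge_with_layer_weights(base_id, expert_id, layer_weight_fn):
    """Interpolate base and expert parameters, weighting each layer separately.

    layer_weight_fn maps a layer index (or None for non-layer parameters such
    as embeddings) to the expert's interpolation weight in [0, 1].
    """
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32)
    expert_sd = AutoModelForCausalLM.from_pretrained(
        expert_id, torch_dtype=torch.float32
    ).state_dict()
    merged_sd = base.state_dict()
    for name, param in merged_sd.items():
        if not param.dtype.is_floating_point:
            continue
        # GPT-NeoX parameter names look like "gpt_neox.layers.3.attention...".
        m = re.search(r"layers\.(\d+)\.", name)
        w = layer_weight_fn(int(m.group(1)) if m else None)
        param.mul_(1.0 - w).add_(expert_sd[name], alpha=w)
    base.load_state_dict(merged_sd)
    return base

# Example: lean on the expert more in the middle layers (hypothetical schedule).
model = merge_with_layer_weights(
    "theblackcat102/pythia-1b-deduped-sft",
    "Multi-Domain-Expert-Layers/expert-uspto",
    lambda i: 0.5 if i is not None and 4 <= i <= 11 else 0.1,
)
```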
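Finally, the test loop in the previous hunk calls a `generate_with_expert` helper whose definition falls outside the shown hunks. Below is a minimal sketch of the idea it describes, loading an expert's weights on demand before generating. The name-to-checkpoint mapping is an assumption, and swapping the whole state dict is a simplification (MDEL's interest is in swapping expert layers).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical name -> checkpoint mapping; the repo keeps per-domain mixtures
# "that can be loaded on demand" (see the model description above).
EXPERT_CHECKPOINTS = {
    "uspto_expert": "Multi-Domain-Expert-Layers/expert-uspto",
    "github_expert": "Multi-Domain-Expert-Layers/expert-github",
}

def generate_with_expert_sketch(model, prompt, tokenizer, expert=None):
    """Load an expert's weights on demand (if one is named), then generate."""
    if expert in EXPERT_CHECKPOINTS:
        expert_sd = AutoModelForCausalLM.from_pretrained(
            EXPERT_CHECKPOINTS[expert], torch_dtype=torch.float32
        ).state_dict()
        model.load_state_dict(expert_sd)  # simplification: replaces every layer
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
    return tokenizer.batch_decode(out, skip_special_tokens=True)

model = AutoModelForCausalLM.from_pretrained(
    "Multi-Domain-Expert-Layers/MDEL-theblackcat-chat-5-experts"
)
tokenizer = AutoTokenizer.from_pretrained("theblackcat102/pythia-1b-deduped-sft")
print(generate_with_expert_sketch(
    model, "Field of the Invention.\nAn electric toothbrush\n",
    tokenizer, expert="uspto_expert",
)[0])
```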