Ontocord.AI committed on
Commit bffbe64
1 Parent(s): 9f3edac

Update README.md

Files changed (1):
  1. README.md +19 -0
README.md:
---
license: apache-2.0
tags:
- MDEL
---

# Model Name
Multi-Domain-Expert-Layers/MDEL-theblackcat-chat-5-experts

# Model Description
This model was generated by averaging the weights of the following models:
- [Multi-Domain-Expert-Layers/expert-pubmed_central](https://huggingface.co/Multi-Domain-Expert-Layers/expert-pubmed_central)
- [Multi-Domain-Expert-Layers/expert-freelaw](https://huggingface.co/Multi-Domain-Expert-Layers/expert-freelaw)
- [Multi-Domain-Expert-Layers/expert-github](https://huggingface.co/Multi-Domain-Expert-Layers/expert-github)
- [Multi-Domain-Expert-Layers/expert-uspto](https://huggingface.co/Multi-Domain-Expert-Layers/expert-uspto)
- [Multi-Domain-Expert-Layers/expert-arxiv](https://huggingface.co/Multi-Domain-Expert-Layers/expert-arxiv)
- [theblackcat102/pythia-1b-deduped-sft](https://huggingface.co/theblackcat102/pythia-1b-deduped-sft)

We also keep mixtures that are each weighted primarily toward one of the above, as experts that can be loaded on demand; a sketch of the averaging step follows.
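For concreteness, here is a minimal sketch of the averaging step, assuming uniform weights, fp32 accumulation, and the Hugging Face `transformers` API. The README does not give the exact recipe, so the weighting and the output directory name are assumptions.

```
import torch
from transformers import AutoModelForCausalLM

EXPERTS = [
    "Multi-Domain-Expert-Layers/expert-pubmed_central",
    "Multi-Domain-Expert-Layers/expert-freelaw",
    "Multi-Domain-Expert-Layers/expert-github",
    "Multi-Domain-Expert-Layers/expert-uspto",
    "Multi-Domain-Expert-Layers/expert-arxiv",
    "theblackcat102/pythia-1b-deduped-sft",
]

def average_checkpoints(repo_ids):
    # Use the first checkpoint as the accumulator and stream the rest in one at a time.
    merged = AutoModelForCausalLM.from_pretrained(repo_ids[0], torch_dtype=torch.float32)
    state = merged.state_dict()  # detached tensors sharing storage with the model
    for repo_id in repo_ids[1:]:
        other = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32)
        for name, tensor in other.state_dict().items():
            if state[name].is_floating_point():
                state[name].add_(tensor)  # running parameter sum, in place
        del other
    for tensor in state.values():
        if tensor.is_floating_point():
            tensor.div_(len(repo_ids))  # divide once at the end: uniform average
    return merged

merged = average_checkpoints(EXPERTS)
merged.save_pretrained("MDEL-theblackcat-chat-5-experts")  # output dir is an assumption
```

Non-floating buffers (e.g., GPT-NeoX's boolean attention masks) are left as in the first checkpoint, since averaging them is meaningless.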
### NOTE: There is a mistake below: we use a routed expert for pubmed-abstract, but the merge used pubmed_central.
```
# test merged experts
# TODO: add dynamic routing, testing better expert mixtures
# ... (intervening lines of the snippet are elided in this diff)
for expert in ["orig_chat", "merged_chat_expert", "uspto_expert", "github_expert
    print(model1.generate_with_expert("Field of the Invention.\nAn electric toothbrush\n", tokenizer, expert=expert)[0])
```
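`model1` and `generate_with_expert` come from the repo's own script, which this diff does not show in full. As a rough illustration of what loading an expert on demand can look like, the sketch below copies one expert's weights into a chosen span of transformer layers of a base Pythia model before generating; the helper name, the layer range, and the choice of base model are all assumptions, not the repo's API.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "theblackcat102/pythia-1b-deduped-sft"
EXPERT = "Multi-Domain-Expert-Layers/expert-uspto"
EXPERT_LAYERS = range(9, 14)  # assumed span of expert-specific layers

model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(BASE)

def load_expert_layers(model, expert_id, layers):
    # Overwrite only the chosen transformer layers with the expert's weights
    # (GPT-NeoX/Pythia parameter names look like "gpt_neox.layers.<i>.<...>").
    expert_state = AutoModelForCausalLM.from_pretrained(expert_id).state_dict()
    own_state = model.state_dict()
    for name, tensor in expert_state.items():
        if any(name.startswith(f"gpt_neox.layers.{i}.") for i in layers):
            own_state[name].copy_(tensor)

load_expert_layers(model, EXPERT, EXPERT_LAYERS)
inputs = tokenizer("Field of the Invention.\nAn electric toothbrush\n", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```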
### To recreate the merged expert, modify the script below. It can also be extended to do dynamic merging and/or to experiment with different weights for different layers.
```
def recreate_merged_expert():
    # ... (the rest of the script is cut off in this diff)
```
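Since the body of `recreate_merged_expert()` is cut off here, the following is a hedged sketch of the per-layer-weight idea the heading mentions: a layer-wise weighted merge between a base model and one expert. The mixing schedule, function name, and output directory are illustrative assumptions, not the repo's script.

```
import re
import torch
from transformers import AutoModelForCausalLM

def merge_with_layer_weights(base_id, expert_id, weight_for_layer):
    # Blend expert into base with a per-layer mixing weight in [0, 1].
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32)
    expert = AutoModelForCausalLM.from_pretrained(expert_id, torch_dtype=torch.float32)
    expert_state = expert.state_dict()
    layer_re = re.compile(r"gpt_neox\.layers\.(\d+)\.")  # GPT-NeoX/Pythia naming
    for name, tensor in base.state_dict().items():
        if not tensor.is_floating_point():
            continue  # skip integer/bool buffers such as attention masks
        m = layer_re.match(name)
        w = weight_for_layer(int(m.group(1))) if m else 0.5
        tensor.mul_(1.0 - w).add_(expert_state[name], alpha=w)
    return base

# Illustrative schedule: favor the expert in the middle layers, the base elsewhere.
merged = merge_with_layer_weights(
    "theblackcat102/pythia-1b-deduped-sft",
    "Multi-Domain-Expert-Layers/expert-uspto",
    lambda i: 0.75 if 9 <= i <= 13 else 0.25,
)
merged.save_pretrained("merged_uspto_expert")  # hypothetical output dir
```

Repeating the blend over several experts in sequence, or having `weight_for_layer` return one weight per expert, would be one way to move toward the dynamic merging mentioned above.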