Tarek07 committed
Commit f8e7590 · verified · 1 Parent(s): 66fb4b8

Update README.md

Files changed (1)
  1. README.md +26 -16
README.md CHANGED
@@ -1,28 +1,38 @@
  ---
- base_model: []
  library_name: transformers
  tags:
  - mergekit
  - merge
-
  ---
- # Progenitor

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
  ### Merge Method

- This model was merged using the [Linear DELLA](https://arxiv.org/abs/2406.11617) merge method using downloads/Llama-3.1-Nemotron-lorablated-70B as a base.

  ### Models Merged

  The following models were included in the merge:
- * downloads/Anubis-70B-v1
- * downloads/L3.1-70B-Hanami-x1
- * downloads/70B-L3.3-Cirrus-x1
- * downloads/Negative_LLAMA_70B
- * downloads/EVA-LLaMA-3.33-70B-v0.1

  ### Configuration
 
@@ -30,34 +40,34 @@ The following YAML configuration was used to produce this model:

  ```yaml
  models:
- - model: downloads/L3.1-70B-Hanami-x1
    parameters:
      weight: 0.20
      density: 0.7
- - model: downloads/70B-L3.3-Cirrus-x1
    parameters:
      weight: 0.20
      density: 0.7
- - model: downloads/Negative_LLAMA_70B
    parameters:
      weight: 0.20
      density: 0.7
- - model: downloads/Anubis-70B-v1
    parameters:
      weight: 0.20
      density: 0.7
- - model: downloads/EVA-LLaMA-3.33-70B-v0.1
    parameters:
      weight: 0.20
      density: 0.7
  merge_method: della_linear
- base_model: downloads/Llama-3.1-Nemotron-lorablated-70B
  parameters:
    epsilon: 0.2
    lambda: 1.1
  dtype: float32
  out_dtype: bfloat16
  tokenizer:
-  source: downloads/Negative_LLAMA_70B

  ```
 
  ---
+ base_model:
+ - TheDrummer/Anubis-70B-v1
+ - EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
+ - meta-llama/Llama-3.3-70B-Instruct
+ - Sao10K/70B-L3.3-Cirrus-x1
+ - SicariusSicariiStuff/Negative_LLAMA_70B
+ - Sao10K/L3.1-70B-Hanami-x1
  library_name: transformers
  tags:
  - mergekit
  - merge
+ license: llama3.3
  ---
+ **Fixed Version**
+
+ Of all my findings in the Progenitor series, this is the culmination of all the best settings. I fixed the typo I had earlier where it wasn't computing in float32, but merging six models in float32 is taxing on resources and time, so I save it for the configurations I think are the best (it's not something I can afford to do with every model I make, just the worthwhile ones). This one also uses Sicarius's tokenizer, which I find the best.
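The float32 point is easy to demonstrate in isolation. The numbers below are invented for illustration, but the mechanism is the one described above: per-model contributions too small for bfloat16's mantissa vanish when accumulated in bfloat16, yet survive a float32 accumulation that is cast to bfloat16 once at the end.

```python
import torch

# Invented numbers, real mechanism: an increment of 0.0015 is below the
# rounding step of bfloat16 around 1.0 (2**-8 = 0.00390625), so adding it
# in bfloat16 is a no-op, while float32 accumulates it without trouble.
base = torch.tensor([1.0])
increment = torch.tensor([0.0015])  # think: weight * per-model delta

acc_bf16 = base.to(torch.bfloat16)
acc_f32 = base.clone()
for _ in range(5):  # five weighted deltas, as in the config below
    acc_bf16 = acc_bf16 + increment.to(torch.bfloat16)
    acc_f32 = acc_f32 + increment

print(acc_bf16.item())                    # 1.0 -- every increment rounded away
print(acc_f32.to(torch.bfloat16).item())  # 1.0078125 -- sum survived, cast once
```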
+
+ # merge

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
  ### Merge Method

+ This model was merged using the [Linear DELLA](https://arxiv.org/abs/2406.11617) merge method using [nbeerbower/Llama-3.1-Nemotron-lorablated-70B](https://huggingface.co/nbeerbower/Llama-3.1-Nemotron-lorablated-70B) as a base.
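In rough terms, DELLA keeps each parameter of a model's delta from the base with a probability that grows with its magnitude (density sets the average keep rate, epsilon the spread), rescales the survivors to stay unbiased, then linearly combines the deltas and scales the sum by lambda. The sketch below is an illustrative reconstruction under those assumptions, not mergekit's actual implementation, and the paper's exact rank-to-probability mapping differs in detail.

```python
import torch

def magprune(delta: torch.Tensor, density: float, epsilon: float) -> torch.Tensor:
    """Magnitude-correlated random pruning of a task vector (illustrative)."""
    p_mean = 1.0 - density                    # average drop probability
    ranks = delta.abs().flatten().argsort().argsort().float()  # 0 = smallest
    n = ranks.numel()
    # Spread drop probabilities across [p_mean - eps/2, p_mean + eps/2];
    # the largest-magnitude entries get the lowest drop probability.
    p = (p_mean + epsilon * (0.5 - ranks / max(n - 1, 1))).clamp(0.0, 0.999)
    keep = torch.bernoulli(1.0 - p)
    # Rescale survivors so the pruned delta is unbiased in expectation.
    return (delta.flatten() * keep / (1.0 - p)).reshape(delta.shape)

def della_linear(base, models, weights, density=0.7, epsilon=0.2, lam=1.1):
    """Linear DELLA, per tensor: weighted sum of pruned deltas, scaled by lambda."""
    merged = sum(w * magprune(m - base, density, epsilon)
                 for m, w in zip(models, weights))
    return base + lam * merged
```

With the configuration below, all five deltas carry weight 0.20, so the merge is effectively an equal-weight average of the pruned task vectors, stretched slightly by lambda = 1.1 before being added back to the lorablated base.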

  ### Models Merged

  The following models were included in the merge:
+ * [TheDrummer/Anubis-70B-v1](https://huggingface.co/TheDrummer/Anubis-70B-v1)
+ * [EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1](https://huggingface.co/EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1)
+ * [Sao10K/70B-L3.3-Cirrus-x1](https://huggingface.co/Sao10K/70B-L3.3-Cirrus-x1)
+ * [SicariusSicariiStuff/Negative_LLAMA_70B](https://huggingface.co/SicariusSicariiStuff/Negative_LLAMA_70B)
+ * [Sao10K/L3.1-70B-Hanami-x1](https://huggingface.co/Sao10K/L3.1-70B-Hanami-x1)

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
  models:
+ - model: Sao10K/L3.1-70B-Hanami-x1
    parameters:
      weight: 0.20
      density: 0.7
+ - model: Sao10K/70B-L3.3-Cirrus-x1
    parameters:
      weight: 0.20
      density: 0.7
+ - model: SicariusSicariiStuff/Negative_LLAMA_70B
    parameters:
      weight: 0.20
      density: 0.7
+ - model: TheDrummer/Anubis-70B-v1
    parameters:
      weight: 0.20
      density: 0.7
+ - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
    parameters:
      weight: 0.20
      density: 0.7
  merge_method: della_linear
+ base_model: nbeerbower/Llama-3.1-Nemotron-lorablated-70B
  parameters:
    epsilon: 0.2
    lambda: 1.1
  dtype: float32
  out_dtype: bfloat16
  tokenizer:
+  source: SicariusSicariiStuff/Negative_LLAMA_70B

  ```
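For completeness: a config like the one above is normally executed with mergekit's `mergekit-yaml` entry point, and the resulting checkpoint loads like any Llama-family model. A minimal consumption sketch, with a placeholder path standing in for the merge output directory (or the published repo id), and bfloat16 matching the `out_dtype` above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: point this at the merge output directory or the published repo id.
model_path = "./merged-progenitor"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches out_dtype in the config above
    device_map="auto",           # shards the 70B weights across available GPUs
)

prompt = "Briefly introduce yourself."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```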