fullstack committed
Commit 6dee3c4
Parent: 7f753b6

Upload folder using huggingface_hub

Files changed (1):
README.md +4 -91
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - merge
 
 ---
-# merged_output_ties_1_4
+# fmx-3b Augmentive Instruct (preview)
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
@@ -21,94 +21,7 @@ This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge m
 
 The following models were included in the merge:
 * [unsloth/Qwen2.5-3B-Instruct](https://huggingface.co/unsloth/Qwen2.5-3B-Instruct)
-* triples/merged_model
-* genstruct/merged_model
-* kg/merged_model
+* triples
+* genstruct
+* kg
 
-### Configuration
-
-The following YAML configuration was used to produce this model:
-
-```yaml
-models:
-  # Base instructed model
-  - model: unsloth/Qwen2.5-3B-Instruct
-    parameters:
-      weight: 1
-      density: 1
-
-  # Merged LoRA models
-  - model: genstruct/merged_model
-    parameters:
-      weight: 1.0
-      density: 1.0
-
-  # - model: summary/merged_model
-  #   parameters:
-  #     weight: 1.0
-  #     density: 1.0
-
-  - model: kg/merged_model
-    parameters:
-      weight: 1.0
-      density: 1.0
-
-  #### THIS BREAKS KG!!!
-  # - model: pII/merged_model
-  #   parameters:
-  #     weight: 1.0
-  #     density: 1.0
-
-  # #### Breaks KG!
-  # - model: preference/merged_model
-  #   parameters:
-  #     weight: 1.0
-  #     density: 1.0
-
-  - model: triples/merged_model
-    parameters:
-      weight: 1.0
-      density: 1.0
-
-  # - model: suitable/merged_model
-  #   parameters:
-  #     weight: 1.0
-  #     density: 1.0
-
-  # - model: feedback/merged_model
-  #   parameters:
-  #     weight: 1.0
-  #     density: 1.0
-
-# Merge configuration
-merge_method: ties
-base_model: unsloth/Qwen2.5-3B
-parameters:
-  normalize: true
-  int8_mask: true
-dtype: bfloat16
-
-# # Tokenizer configuration
-# tokenizer_source: Qwen/Qwen1.5-14B-Chat
-# tokenizer_parameters:
-#   trust_remote_code: true
-
-# # Output configuration
-# output:
-#   precision: bfloat16
-#   model_format: safetensors
-#   max_shard_size: "4GB"
-
-# # Training configuration (for potential fine-tuning)
-# training:
-#   learning_rate: 2e-5
-#   warmup_steps: 100
-#   gradient_checkpointing: true
-#   gradient_accumulation_steps: 4
-
-# # Hardware optimization
-# hardware:
-#   mixed_precision: true
-#   cuda_memory_fraction: 0.95
-#   optimize_model_memory: true
-```
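The YAML removed by this commit is a standard mergekit TIES configuration: a list of `models` each carrying per-model `weight` and `density` parameters, plus a top-level `merge_method`, `base_model`, and `dtype`. As a sanity check, here is a minimal sketch (assuming PyYAML is installed; the trimmed config text below is an illustration, not the full original) that parses an equivalent fragment and verifies the fields a TIES merge relies on:

```python
import yaml  # PyYAML, assumed installed

# Trimmed version of the removed TIES configuration (illustrative subset).
config_text = """
models:
  - model: unsloth/Qwen2.5-3B-Instruct
    parameters:
      weight: 1
      density: 1
  - model: genstruct/merged_model
    parameters:
      weight: 1.0
      density: 1.0
merge_method: ties
base_model: unsloth/Qwen2.5-3B
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
"""

config = yaml.safe_load(config_text)

# A TIES merge needs a base model and per-model weight/density values.
assert config["merge_method"] == "ties"
assert config["base_model"] == "unsloth/Qwen2.5-3B"
for entry in config["models"]:
    assert {"weight", "density"} <= set(entry["parameters"])
print("config OK:", len(config["models"]), "models")
```

With mergekit installed, a config like this is typically applied via its `mergekit-yaml` command, pointing it at the config file and an output directory.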