LilyWinter
committed
Update README.md
README.md
CHANGED
@@ -109,13 +109,13 @@ model-index:
# BoreanGale-70B
-A merge using a custom algorithm of:
- [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
- [Sao10K/WinterGoddess-1.4x-70B-L2](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
<img src=https://huggingface.co/alchemonaut/BoreanGale-70B/resolve/main/bg.png>
-
This version of the model uses *t* = 0.001. At this *t*, about 10% of weights are fully switched to WinterGoddess. Model quality rapidly degrades above *t* = 0.0025:
@@ -125,6 +125,7 @@ This version of the model uses *t* = 0.001. At this *t*, about 10% of weights ar
- *t* = 0.005 (~35% full swap): Garbage; semi-related word lists
- *t* = 0.01 (~55% full swap): Garbage; pseudorandom tokens output
```
t: Union[float, np.ndarray],
v0: Union[np.ndarray, torch.Tensor],
# BoreanGale-70B
+A merge using a custom algorithm (NearSwap) of:
- [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
- [Sao10K/WinterGoddess-1.4x-70B-L2](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
<img src=https://huggingface.co/alchemonaut/BoreanGale-70B/resolve/main/bg.png>
+NearSwap retains most of the weights of the base model (Miqu), but where a weight is similar in both models, it is interpolated toward the secondary model's (WinterGoddess) value. A parameter *t* sets the similarity threshold: when the distance between the two values is below *t*, the secondary model's weight is used outright.
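The description above can be illustrated with a minimal sketch. It is not the author's implementation: the blend weight used here (*t* divided by the distance, capped at 1) is an assumption chosen so that weights closer than *t* are swapped outright and more distant weights move only slightly. The name `nearswap_sketch` and the toy arrays are hypothetical.

```python
import numpy as np


def nearswap_sketch(t: float, v0: np.ndarray, v1: np.ndarray) -> np.ndarray:
    """Move base weights v0 toward secondary weights v1 where the two are close."""
    dist = np.abs(v0 - v1)
    # ASSUMPTION: the blend weight is t / distance, capped at 1.
    # Guard the division so identical weights (distance 0) do not divide by zero.
    safe_dist = np.where(dist > 0, dist, 1.0)
    weight = np.where(dist > 0, np.minimum(t / safe_dist, 1.0), 1.0)
    # weight == 1 is a full swap to v1; a weight near 0 leaves v0 almost unchanged.
    return v0 + weight * (v1 - v0)


base = np.array([0.5000, 0.5000, 0.5000])       # stand-in for Miqu weights
secondary = np.array([0.5005, 0.5020, 0.6000])  # stand-in for WinterGoddess weights
merged = nearswap_sketch(0.001, base, secondary)
```

With *t* = 0.001, the first weight (distance 0.0005) is swapped outright, while the other two move only part of the way toward the secondary values.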
This version of the model uses *t* = 0.001. At this *t*, about 10% of weights are fully switched to WinterGoddess. Model quality rapidly degrades above *t* = 0.0025:
@@ -125,6 +125,7 @@
- *t* = 0.005 (~35% full swap): Garbage; semi-related word lists
- *t* = 0.01 (~55% full swap): Garbage; pseudorandom tokens output
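The "full swap" percentages quoted for each *t* could be measured directly from a pair of weight tensors. The sketch below assumes that a full swap means the absolute difference between the two weights is strictly below *t*. The helper `full_swap_fraction` and the toy arrays are hypothetical and are not real model weights.

```python
import numpy as np


def full_swap_fraction(t: float, v0: np.ndarray, v1: np.ndarray) -> float:
    """Share of weights whose absolute difference is below the threshold t."""
    return float(np.mean(np.abs(v0 - v1) < t))


# Toy values for illustration only, not taken from either model.
base = np.zeros(4)
secondary = np.array([0.0005, 0.003, 0.008, 0.05])

# Evaluate the same pair of tensors at several thresholds.
fractions = {t: full_swap_fraction(t, base, secondary) for t in (0.001, 0.005, 0.01)}
```

Raising *t* can only increase this fraction, which is consistent with the larger swap percentages listed for larger thresholds.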
+NearSwap implementation:
```
t: Union[float, np.ndarray],
v0: Union[np.ndarray, torch.Tensor],