alchemonaut/BoreanGale-70B


LilyWinter committed on Feb 3, 2024

Commit c4df438 · verified · 1 Parent(s): 965b8d7

Update README.md

Files changed (1)

README.md +3 -2

@@ -109,13 +109,13 @@ model-index:
 # BoreanGale-70B
-A merge using a custom algorithm of:
 - [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
 - [Sao10K/WinterGoddess-1.4x-70B-L2](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
 <img src=https://huggingface.co/alchemonaut/BoreanGale-70B/resolve/main/bg.png>
-This merge retains most of the weights of Miqu, but when a weight is similar between the two, it is interpolated to the WinterGoddess value. A parameter *t* specifies the sameness threshold. When the distance between two values is below *t*, the weight from WinterGoddess is used.
 This version of the model uses *t* = 0.001. At this *t*, about 10% of weights are fully switched to WinterGoddess. Model quality rapidly degrades above *t* = 0.0025:
@@ -125,6 +125,7 @@ This version of the model uses *t* = 0.001. At this *t*, about 10% of weights ar
 - *t* = 0.005  (~35% full swap): Garbage; semi-related word lists
 - *t* = 0.01   (~55% full swap): Garbage; pseudorandom tokens output
 ```
     t: Union[float, np.ndarray],
     v0: Union[np.ndarray, torch.Tensor],

 # BoreanGale-70B
+A merge using a custom algorithm (NearSwap) of:
 - [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
 - [Sao10K/WinterGoddess-1.4x-70B-L2](https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2)
 <img src=https://huggingface.co/alchemonaut/BoreanGale-70B/resolve/main/bg.png>
+NearSwap retains most of the weights of the base model (Miqu), but when a weight is similar between the two, it is interpolated to the secondary model (WinterGoddess) value. A parameter *t* specifies the sameness threshold. When the distance between two values is below *t*, the weight from the secondary model (WinterGoddess) is used.
 This version of the model uses *t* = 0.001. At this *t*, about 10% of weights are fully switched to WinterGoddess. Model quality rapidly degrades above *t* = 0.0025:
 - *t* = 0.005  (~35% full swap): Garbage; semi-related word lists
 - *t* = 0.01   (~55% full swap): Garbage; pseudorandom tokens output
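
The thresholding rule described above can be sketched in NumPy. This is a hypothetical illustration, not the author's code: the function name `nearswap_sketch` and the interpolation weight `min(1, t / |base - secondary|)` are assumptions chosen to match the description, so that weights closer than *t* take the secondary value outright and more distant weights are only nudged toward it.

```python
import numpy as np


def nearswap_sketch(t, base, secondary):
    """Hypothetical sketch of a threshold-based merge.

    ASSUMPTION: the interpolation weight is t / distance, clipped to [0, 1].
    The README only states that weights closer than t are fully swapped.
    """
    base = np.asarray(base, dtype=np.float64)
    secondary = np.asarray(secondary, dtype=np.float64)

    # Element-wise distance between the two models' weights.
    distance = np.abs(base - secondary)

    # Avoid division by zero: identical weights get a weight of 1.0.
    safe_distance = np.where(distance > 0, distance, 1.0)
    weight = np.where(distance > 0, t / safe_distance, 1.0)

    # distance <= t gives weight 1.0 (full swap); larger distances give less.
    weight = np.clip(weight, 0.0, 1.0)

    # Linear interpolation from the base toward the secondary model.
    return (1.0 - weight) * base + weight * secondary


if __name__ == "__main__":
    merged = nearswap_sketch(
        0.001,
        base=[0.5, 0.1, -0.2],
        secondary=[0.5004, 0.3, -0.2],
    )
    print(merged)
```

Under this assumed weighting, no individual weight moves by more than *t*: near-identical weights are swapped, and distant ones shift toward the secondary model by exactly *t*.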
+NearSwap implementation:
 ```
     t: Union[float, np.ndarray],
     v0: Union[np.ndarray, torch.Tensor],