metadata
tags:
  - merge

BoreanGale-70B

A merge of Miqu and WinterGoddess using a custom algorithm.

This merge retains most of Miqu's weights, but where a weight is similar between the two models, it is interpolated toward the WinterGoddess value. A parameter t sets the sameness threshold: when the distance between the two values is below t, the WinterGoddess weight is used outright.
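
The rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's actual merge script: the helper name `threshold_merge` and the toy arrays are hypothetical, and the interpolation weight `clip(t / |v0 - v1|, 0, 1)` is an assumption taken from the code excerpt later in this README.

```python
import numpy as np

def threshold_merge(v0: np.ndarray, v1: np.ndarray, t: float) -> np.ndarray:
    """Hypothetical helper: keep v0 (base), move toward v1 (donor) where they are close."""
    diff = np.abs(v0 - v1)
    # Identical entries divide by zero; silence the warning and map them to weight 1.0.
    with np.errstate(divide="ignore", invalid="ignore"):
        w = t / diff
    w = np.nan_to_num(w, nan=1.0, posinf=1.0, neginf=1.0)
    w = np.clip(w, 0.0, 1.0)  # distance <= t saturates at 1.0 (fully switched)
    return (1.0 - w) * v0 + w * v1  # plain linear interpolation

# Toy stand-ins for one tensor from each model (not real weights).
miqu = np.array([0.5000, 0.2000, -0.1000])
winter = np.array([0.5004, 0.2100, -0.1000])

merged = threshold_merge(miqu, winter, t=0.001)
print(merged)  # approximately [0.5004, 0.201, -0.1]
```

The first entry differs by less than t, so it takes the donor value; the second differs by ten times t, so it moves only a tenth of the way.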

This version of the model uses t = 0.001. This value was selected so that a small but nonzero number of weights are fully switched to WinterGoddess. Model quality degrades rapidly above t = 0.0025:

  • t = 0.001: This model
  • t = 0.0025: Generates one paragraph okay, but then reverts to garbage
  • t = 0.005: Garbage; semi-related word lists
  • t = 0.01: Garbage; pseudorandom tokens output
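
One way to see how many weights a given t fully switches is to count the entries whose distance is at or below t. The sketch below is illustrative only: `fraction_switched` is a hypothetical helper, and the arrays are synthetic random data, not weights from either model, so the printed fractions say nothing about the real merge.

```python
import numpy as np

def fraction_switched(v0: np.ndarray, v1: np.ndarray, t: float) -> float:
    """Share of entries with |v0 - v1| <= t, i.e. entries that take the v1 value outright."""
    return float(np.mean(np.abs(v0 - v1) <= t))

# Synthetic stand-ins for two weight tensors (assumed distributions, for illustration).
rng = np.random.default_rng(0)
base = rng.normal(0.0, 0.02, size=100_000)
donor = base + rng.normal(0.0, 0.01, size=100_000)

for t in (0.001, 0.0025, 0.005, 0.01):
    print(f"t = {t}: {fraction_switched(base, donor, t):.3f} of weights fully switched")
```

Run against real tensors, a check like this would show how quickly the switched share grows as t increases.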
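
The core of the merge function, excerpted:

    # Excerpt only; the elided parts of the function are not shown.
    t: Union[float, np.ndarray],          # sameness threshold
    v0: Union[np.ndarray, torch.Tensor],  # base weights (Miqu, per the description above)
    v1: Union[np.ndarray, torch.Tensor],  # donor weights (WinterGoddess)
...
    # Per-element distance between the two models' weights.
    lweight = numpy.absolute(v0 - v1)
    # Interpolation weight: the closer the weights, the larger the share of v1.
    lweight = t / lweight
    # Identical weights divide by zero; treat them as fully switched (1.0).
    lweight = numpy.nan_to_num(lweight, nan=1.0, posinf=1.0, neginf=1.0)
    # Clamp in place to [0, 1]; any distance at or below t saturates at 1.0.
    numpy.clip(lweight, a_min=0.0, a_max=1.0, out=lweight)
    # Blend toward v1 (assuming lerp(w, a, b) = (1 - w) * a + w * b).
    res = lerp(lweight, v0, v1)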
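
The excerpt above can be written in closed form. This is a derivation under two assumptions that the README does not state: that `lerp(w, v0, v1)` computes $(1 - w)\,v_0 + w\,v_1$, and that $t \ge 0$.

$$
w = \min\!\left(1,\ \frac{t}{|v_1 - v_0|}\right), \qquad
\mathrm{res} = (1 - w)\,v_0 + w\,v_1 = v_0 + \operatorname{sign}(v_1 - v_0)\,\min\!\left(|v_1 - v_0|,\ t\right)
$$

For $v_1 = v_0$ the weight is set to 1 and the result is the shared value. Under these assumptions, each weight moves from its Miqu value toward the WinterGoddess value by at most t, and it reaches the WinterGoddess value exactly when the two are within t of each other.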