Nuslerp parameters?

#1
by sometimesanotion - opened

This nuslerp is fascinating, and new to me. I like that you adopted my slices-oriented YAML, which was initially inspired by @CultriX's EvolMerge models. I've been developing it further; it helps target specific layers with gradients, and it helps run merges with less CUDA memory.

I also see you are combining breadcrumbs with another merge strategy! Neat!

I'm curious to know if gradients like

        parameters:
          weight: [ 0.35, 30 ]
          nuslerp_flatten: false
          nuslerp_row_wise: true

...will give you more reliable results. I'm used to regular slerp with its t parameter!
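
To make the comparison with slerp's `t` concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration rather than mergekit's actual code: the helper names (`slerp`, `expand_gradient`, `weights_to_t`) are hypothetical, the gradient is assumed to expand linearly across layers, and the mapping from two per-model weights to a single `t` is assumed to be a simple normalization.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two non-zero vectors (plain lists)."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if abs(math.sin(theta)) < eps:
        # Nearly colinear vectors: fall back to ordinary linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

def expand_gradient(anchors, num_layers):
    """Assumed behavior: linearly interpolate anchor values across the layers."""
    if num_layers == 1:
        return [float(anchors[0])]
    values = []
    for i in range(num_layers):
        pos = i / (num_layers - 1) * (len(anchors) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        values.append((1 - frac) * anchors[lo] + frac * anchors[hi])
    return values

def weights_to_t(w0, w1):
    """Assumed mapping: normalize two per-model weights into a slerp-style t."""
    total = w0 + w1
    return 0.5 if abs(total) < 1e-6 else w1 / total

if __name__ == "__main__":
    # The gradient from the YAML above, spread over a hypothetical 5-layer slice.
    print(expand_gradient([0.35, 30], 5))
    # Two equal weights would correspond to the midpoint, t = 0.5.
    print(slerp(weights_to_t(1.0, 1.0), [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, only the ratio between the two models' weights matters, so a gradient running from 0.35 to 30 would shift the effective `t` considerably from the first layer to the last, depending on the other model's weight.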

There's a lot that can be said about this mathematically. I'll follow up later with a "roughly reasonable" explanation based on the mathematical properties involved. It will be long, so please give me some time to make it sufficiently reasonable.
