Nuslerp parameters?
This nuslerp is fascinating, and new to me. I like that you adopted my slices-oriented YAML, which was initially inspired by @CultriX's EvolMerge models. I've been developing it further; it helps target layers for specific gradients, and helps run merges with less CUDA memory.
I also see you are combining breadcrumbs with another merge strategy! Neat!
I'm curious to know if gradients like
parameters:
  weight: [ 0.35, 30 ]
  nuslerp_flatten: false
  nuslerp_row_wise: true
...will give you more reliable results. I'm used to regular slerp with its t parameter!
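For comparison, here is a minimal sketch of the plain slerp I'm used to, plus one way a pair of relative weights could collapse into a single t. The helper `weights_to_t` is hypothetical, and the normalization t = w1 / (w0 + w1) is my assumption about how relative weights would map onto slerp, not something I have confirmed against the nuslerp implementation.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two vectors given as lists of floats."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        # A zero vector has no direction, so fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two directions, clamped against rounding error.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Nearly parallel (or antiparallel) vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def weights_to_t(w0, w1):
    """Hypothetical helper: reduce two relative weights to one slerp t (assumed mapping)."""
    total = w0 + w1
    if abs(total) < 1e-8:
        return 0.5  # assumed fallback when the weights cancel out
    return w1 / total
```

Under that assumption, weights of 0.35 and 0.65 on the two models would behave like slerp with t = 0.65, so a per-layer weight gradient would play the same role as a per-layer t gradient.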
There's a lot that can be said about this mathematically. I'll follow up later with a "roughly reasonable" explanation based on the mathematical properties involved. It will be long, so please give me some time to make it sufficiently sound.