Merging Models of Different Parameter Size
#1 opened by arhanovich
Do you have a successful method for merging models of different parameter sizes?
Kinda. I did some shenanigans with mergekit in the breaking_math and weighting_interp branches, but I think it's just adding noise. I'm not sure it's adding anything useful to the small model: on the Open LLM Leaderboard the merge degrades the model, and some merges tend to generate gibberish/weird words from time to time.
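Roughly the kind of thing I mean, for anyone curious: a minimal PyTorch sketch (hypothetical, not the actual code in those branches) that crops a larger weight matrix down to the smaller model's shape and lerps the two. Since the two models' hidden dimensions aren't aligned, the cropped weights look like noise from the small model's point of view, which would explain the degradation:

```python
import torch

def lerp_overlap(small: torch.Tensor, big: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Interpolate `small` with the top-left corner of `big`.

    Rows/columns of `big` beyond `small`'s shape are discarded, so the result
    keeps the small model's shape but mixes in (likely misaligned) weights
    from the big model.
    """
    # Crop `big` down to `small`'s shape along every dimension.
    slices = tuple(slice(0, s) for s in small.shape)
    cropped = big[slices]
    return (1.0 - t) * small + t * cropped

# Toy example with the real hidden sizes: Mistral-7B uses 4096, LLaMA2-13B uses 5120.
small_w = torch.randn(4096, 4096)
big_w = torch.randn(5120, 5120)
merged = lerp_overlap(small_w, big_w, t=0.3)
print(merged.shape)  # torch.Size([4096, 4096])
```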
Here I tried a bigger one: teknium/OpenHermes-2-Mistral-7B with 40 layers + KoboldAI/LLaMA2-13B-Tiefighter, but I don't think it's working as intended.
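For the "40 layers" part, here's a rough sketch of the idea (hypothetical helper, toy values standing in for real tensors): expanding a 32-layer LLaMA-style checkpoint to 40 layers by repeating a band of decoder layers in the state dict. Mergekit's passthrough merge method is the proper way to do this; this just shows the key renaming:

```python
import re
from collections import OrderedDict

LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)$")

def expand_layers(state_dict: dict, plan: list[int]) -> dict:
    """Return a new state dict whose layer i comes from source layer plan[i]."""
    out = OrderedDict()
    # Copy non-layer weights (embeddings, final norm, lm_head, ...) unchanged.
    for key, tensor in state_dict.items():
        if LAYER_RE.match(key) is None:
            out[key] = tensor
    # Re-emit decoder layers in the order given by `plan`, renumbering keys.
    for new_idx, src_idx in enumerate(plan):
        for key, tensor in state_dict.items():
            m = LAYER_RE.match(key)
            if m and int(m.group(1)) == src_idx:
                out[f"model.layers.{new_idx}.{m.group(2)}"] = tensor
    return out

# 32-layer model -> 40 layers: keep layers 0-23, then repeat layers 16-31.
plan = list(range(24)) + list(range(16, 32))

# Toy state dict with one key per layer just to demonstrate the renaming.
toy = {f"model.layers.{i}.self_attn.q_proj.weight": i for i in range(32)}
toy["model.embed_tokens.weight"] = "emb"
expanded = expand_layers(toy, plan)
print(sum(k.startswith("model.layers.") for k in expanded))  # 40
```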