Merging Models of Different Parameter Size
#1 opened by arhanovich
Do you have a successful method for merging models of different parameter sizes?
Kinda. I did some shenanigans with mergekit in the breaking_math and weighting_interp branches, but I think it's just adding noise. I'm not sure it's adding anything useful to the small model: on the Open LLM Leaderboard the merge degrades the model, and some merges tend to generate gibberish/weird words from time to time.
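Roughly the kind of thing I mean, for anyone curious: a minimal PyTorch sketch (hypothetical, not the actual code in those branches) that crops a larger weight matrix down to the smaller model's shape and lerps the two. Since the two models' hidden dimensions aren't aligned, the cropped weights look like noise from the small model's point of view, which would explain the degradation:

```python
import torch

def lerp_overlap(small: torch.Tensor, big: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Interpolate `small` with the top-left corner of `big`.

    Rows/columns of `big` beyond `small`'s shape are discarded, so the result
    keeps the small model's shape but mixes in (likely misaligned) weights
    from the big model.
    """
    # Crop `big` down to `small`'s shape along every dimension.
    slices = tuple(slice(0, s) for s in small.shape)
    cropped = big[slices]
    return (1.0 - t) * small + t * cropped

# Toy example with the real hidden sizes: Mistral-7B uses 4096, LLaMA2-13B uses 5120.
small_w = torch.randn(4096, 4096)
big_w = torch.randn(5120, 5120)
merged = lerp_overlap(small_w, big_w, t=0.3)
print(merged.shape)  # torch.Size([4096, 4096])
```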
Here I tried a bigger one: teknium/OpenHermes-2-Mistral-7B with 40 layers + KoboldAI/LLaMA2-13B-Tiefighter, but I don't think it's working as intended.
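For the "40 layers" part, here's a rough sketch of the idea (hypothetical helper, toy values standing in for real tensors): expanding a 32-layer LLaMA-style checkpoint to 40 layers by repeating a band of decoder layers in the state dict. Mergekit's passthrough merge method is the proper way to do this; this just shows the key renaming:

```python
import re
from collections import OrderedDict

LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)$")

def expand_layers(state_dict: dict, plan: list[int]) -> dict:
    """Return a new state dict whose layer i comes from source layer plan[i]."""
    out = OrderedDict()
    # Copy non-layer weights (embeddings, final norm, lm_head, ...) unchanged.
    for key, tensor in state_dict.items():
        if LAYER_RE.match(key) is None:
            out[key] = tensor
    # Re-emit decoder layers in the order given by `plan`, renumbering keys.
    for new_idx, src_idx in enumerate(plan):
        for key, tensor in state_dict.items():
            m = LAYER_RE.match(key)
            if m and int(m.group(1)) == src_idx:
                out[f"model.layers.{new_idx}.{m.group(2)}"] = tensor
    return out

# 32-layer model -> 40 layers: keep layers 0-23, then repeat layers 16-31.
plan = list(range(24)) + list(range(16, 32))

# Toy state dict with one key per layer just to demonstrate the renaming.
toy = {f"model.layers.{i}.self_attn.q_proj.weight": i for i in range(32)}
toy["model.embed_tokens.weight"] = "emb"
expanded = expand_layers(toy, plan)
print(sum(k.startswith("model.layers.") for k in expanded))  # 40
```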