What software was used?

#2
by Jcuhfehl - opened

I tried to reproduce a similar merged Mistral model using mergekit, but couldn't get it running in llama.cpp. Your model works properly through the same conversion process, so my conversion doesn't seem to be the problem. What software did you use to create this, and did you use any special settings?

No special settings. mergekit was updated and now expects a different config format. The new one should look like this:

slices:
  - sources:
    - model: Norquinal/PetrolLM
      layer_range: [0, 16]
  - sources:
    - model: Undi95/Mistral-LimaRP-v3-7B
      layer_range: [8, 20]
  - sources:
    - model: Norquinal/PetrolLM
      layer_range: [17, 22]
  - sources:
    - model: Undi95/Mistral-LimaRP-v3-7B
      layer_range: [21, 26]
  - sources:
    - model: Norquinal/PetrolLM
      layer_range: [23, 30]
  - sources:
    - model: Undi95/Mistral-LimaRP-v3-7B
      layer_range: [27, 32]
merge_method: passthrough
dtype: float16

Save this as merge.yaml and run it through mergekit; you should get the same result.
Modify it as needed with the models you want.
If the output model doesn't work, you are probably using models that are too different or that use different architectures, but in that case mergekit should warn you. Which models did you try to merge?
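
If you change the ranges, a quick sanity check before merging can save a failed run. The sketch below is not part of mergekit. It assumes that layer_range is half-open (start included, end excluded) and that both source models have 32 transformer layers, as Mistral 7B does. The names SLICES, NUM_SOURCE_LAYERS, count_layers and check_ranges are made up for illustration.

```python
# Slices copied by hand from the config above: (model, (start, end)).
SLICES = [
    ("Norquinal/PetrolLM", (0, 16)),
    ("Undi95/Mistral-LimaRP-v3-7B", (8, 20)),
    ("Norquinal/PetrolLM", (17, 22)),
    ("Undi95/Mistral-LimaRP-v3-7B", (21, 26)),
    ("Norquinal/PetrolLM", (23, 30)),
    ("Undi95/Mistral-LimaRP-v3-7B", (27, 32)),
]

# Assumption: each source model has 32 layers (true for Mistral 7B).
NUM_SOURCE_LAYERS = 32


def count_layers(slices):
    """Total layers in the merged stack, assuming half-open [start, end) ranges."""
    return sum(end - start for _, (start, end) in slices)


def check_ranges(slices, num_layers=NUM_SOURCE_LAYERS):
    """Return a list of messages for ranges that fall outside the source model."""
    problems = []
    for model, (start, end) in slices:
        if not 0 <= start < end <= num_layers:
            problems.append(f"{model}: bad layer_range [{start}, {end}]")
    return problems


if __name__ == "__main__":
    print("merged layer count:", count_layers(SLICES))
    print("problems:", check_ranges(SLICES))
```

Under those assumptions the config above stacks 50 layers and every range stays inside the 32-layer source models.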

Thanks, I got it working with your config.

Jcuhfehl changed discussion status to closed
