SUONG-4 (7B Parameters)

This is a merge of pre-trained language models created using LazyMergekit, combining the strengths of the NeuralHermes and OpenHermes models through an optimized progressive fusion approach.

About Me

I'm David Soeiro-Vuong, a third-year Computer Science student working as an apprentice at TW3 Partners, a company specializing in Generative AI. Passionate about artificial intelligence and language model optimization, I focus on creating efficient model merges that balance performance and capabilities.

🔗 Connect with me on LinkedIn

Merge Details

Merge Method

This model uses SLERP (Spherical Linear Interpolation) with a carefully tuned progressive fusion approach (see the sketch after the list below):

  • Progressive attention-layer fusion (interpolation factor rising from 0 to 1 across the network)
  • Inverse MLP-layer transition (factor falling from 1 to 0)
  • Global fusion ratio of 0.45 for all remaining parameters
  • bfloat16 format for efficient memory usage
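The interpolation itself can be pictured with a small, self-contained sketch. This is not mergekit's internal code; the function name, tensor arguments, and the fallback to plain linear interpolation are illustrative assumptions.

import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (t=0 -> w_a, t=1 -> w_b)."""
    a = w_a.flatten() / (w_a.norm() + eps)                      # unit direction of first model's weights
    b = w_b.flatten() / (w_b.norm() + eps)                      # unit direction of second model's weights
    omega = torch.acos(torch.clamp((a * b).sum(), -1.0, 1.0))   # angle between the two directions
    so = torch.sin(omega)
    if so.abs() < eps:                                           # nearly parallel: fall back to plain lerp
        return (1.0 - t) * w_a + t * w_b
    return (torch.sin((1.0 - t) * omega) / so) * w_a + (torch.sin(t * omega) / so) * w_b

Here t is not a single constant: the gradients in the configuration below drive attention tensors from 0 toward 1 and MLP tensors from 1 toward 0 across the 32 layers, with 0.45 applied to all other parameters.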

Models Merged

The following models were included in the merge:

  • mlabonne/NeuralHermes-2.5-Mistral-7B (base model)
  • teknium/OpenHermes-2.5-Mistral-7B

Configuration

slices:
  - sources:
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralHermes-2.5-Mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.3, 0.6, 0.9, 1]  
    - filter: mlp
      value: [1, 0.7, 0.4, 0.1, 0]  
    - value: 0.45 
dtype: bfloat16
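
The merged checkpoint can be loaded like any other Mistral-7B model. Below is a minimal inference example with the transformers library; the prompt and generation settings are placeholders, and device_map="auto" assumes the accelerate package is installed.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Davidsv/SUONG-4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the merge dtype
    device_map="auto",            # requires accelerate
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))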