Depth pruned and fine tuned Llama-3.1-8B
This is a merge of meta-llama/Meta-Llama-3.1-8B created using mergekit, following the layer-pruning approach described in the paper "The Unreasonable Ineffectiveness of the Deeper Layers".
This model was merged using the passthrough merge method.
The following model was included in the merge:
- meta-llama/Meta-Llama-3.1-8B
The following YAML configuration was used to produce this model:
dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 23]
    model: meta-llama/Meta-Llama-3.1-8B
- sources:
  - layer_range: [28, 32]
    model: meta-llama/Meta-Llama-3.1-8B
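To make the effect of the two slices concrete, the following minimal sketch counts which decoder layers survive the merge. It assumes that each `layer_range` is half-open (start included, end excluded) and that the base model has 32 decoder layers. The names `SLICES`, `kept_layers`, and `dropped_layers` are hypothetical helpers for illustration and are not part of mergekit.

```python
# Slices copied from the configuration above. Each range is treated as
# half-open [start, end); this is an assumption about the range convention.
SLICES = [(0, 23), (28, 32)]
ORIGINAL_NUM_LAYERS = 32  # assumed decoder layer count of the base model


def kept_layers(slices):
    """Return the original layer indices kept, in output order."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers


def dropped_layers(slices, total):
    """Return the original layer indices that no slice covers."""
    kept = set(kept_layers(slices))
    return [i for i in range(total) if i not in kept]


kept = kept_layers(SLICES)
dropped = dropped_layers(SLICES, ORIGINAL_NUM_LAYERS)
print(f"kept {len(kept)} of {ORIGINAL_NUM_LAYERS} layers")
print(f"dropped layers: {dropped}")
```

Under these assumptions the merged model keeps 27 of the 32 layers and drops the contiguous block of layers 23 through 27, while the final four layers are retained.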
MMLU-Pro 0-shot: 0.2642 [TIGER-AI-Lab/MMLU-Pro]
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).