---
library_name: transformers
tags:
- Structured Pruning
- Phi-2
- Memory-efficient Pruning
license: mit
language:
- en
---

# Model Card for Bonsai-PrunedPhi-1.8B

We prune the Phi-2 (2.7B) model to 35% sparsity (1.8B parameters) and then fine-tune it on 100K sequences of length 2048 from the C4 dataset (https://huggingface.co/datasets/c4).

Our pruning algorithm is described in the paper [Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes](https://arxiv.org/abs/2402.05406). [Code for the pruning algorithm can be found here](https://github.com/ldery/Bonsai/tree/main).

## Model Details

This model is derived by pruning the [Phi-2 model](https://huggingface.co/microsoft/phi-2).

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

- **Developed by:** Lucio Dery, Steven Kolawole, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar
- **Model type:** Decoder-only
- **Language(s) (NLP):** English
- **License:** MIT

### Model Sources

- **Repository:** https://github.com/ldery/Bonsai/tree/main
- **Paper:** https://arxiv.org/abs/2402.05406

## Training Details

### Training Data

Fine-tuned on 100K sequences of length 2048 from the C4 dataset (https://huggingface.co/datasets/c4).

### Training Procedure

Full fine-tuning.

#### Training Hyperparameters

- Distillation KL weight: 0.01
- Learning rate: 1e-4
- Batch size: 128
- Optimizer: AdamW
- Warmup steps: 5

### License

The model is licensed under the [MIT license](https://huggingface.co/luciodery/Bonsai-PrunedPhi-1.8B/blob/main/LICENSE).

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA A6000

## Citation

**BibTeX:**

    @misc{dery2024everybody,
          title={Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes},
          author={Lucio Dery and Steven Kolawole and Jean-Francois Kagy and Virginia Smith and Graham Neubig and Ameet Talwalkar},
          year={2024},
          eprint={2402.05406},
          archivePrefix={arXiv},
          primaryClass={cs.LG}
    }

## Model Card Authors

Lucio Dery: ldery@andrew.cmu.edu

## Model Card Contact

ldery@andrew.cmu.edu
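For reference, the distillation term in the training hyperparameters (KL weight 0.01) can be sketched as below. This is an illustrative NumPy formulation of a standard objective, combining next-token cross-entropy with a KL term against a frozen teacher; the exact loss used in the Bonsai codebase may differ.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, kl_weight=0.01):
    """Cross-entropy on hard labels plus kl_weight * KL(teacher || student)."""
    student_probs = softmax(student_logits)  # shape: (num_tokens, vocab)
    teacher_probs = softmax(teacher_logits)
    # Standard next-token cross-entropy against the ground-truth token ids.
    ce = -np.mean(np.log(student_probs[np.arange(len(labels)), labels]))
    # KL divergence from the teacher's distribution to the student's.
    kl = np.mean(np.sum(teacher_probs * (np.log(teacher_probs) - np.log(student_probs)), axis=-1))
    return ce + kl_weight * kl
```

With `kl_weight=0.01` the hard-label cross-entropy dominates and the teacher acts as a light regularizer, which matches the small KL weight reported above.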
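## How to Get Started with the Model

The pruned checkpoint should load like any other causal LM in 🤗 transformers. A minimal sketch follows; the repo id is taken from the license link in this card, while the prompt format and generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "luciodery/Bonsai-PrunedPhi-1.8B"  # repo id from the license link above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Phi-2-style "Instruct:/Output:" prompt; adjust to your use case.
prompt = "Instruct: Explain structured pruning in one sentence.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Depending on your transformers version, loading may additionally require `trust_remote_code=True`, as was historically the case for Phi-2.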