|
--- |
|
library_name: transformers |
|
tags: |
|
- Structured Pruning |
|
- Phi-2 |
|
- Memory-efficient Pruning |
|
license: mit |
|
language: |
|
- en |
|
--- |
|
|
|
# Model Card for Bonsai-PrunedPhi-1.8B
|
|
|
We prune the Phi-2 (2.7B) model to 35% sparsity (1.8B parameters) and then fine-tune it on 100K sequences of length 2048 from the [C4 dataset](https://huggingface.co/datasets/c4).
|
Our pruning algorithm is described in the paper [Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes](https://arxiv.org/abs/2402.05406). |
|
Code for the pruning algorithm is available in the [Bonsai repository](https://github.com/ldery/Bonsai/tree/main).
|
|
|
## Model Details |
|
The model is derived by pruning the [Phi-2 model](https://huggingface.co/microsoft/phi-2).
|
|
|
### Model Description |
|
|
|
|
|
|
Bonsai-PrunedPhi-1.8B is a 🤗 transformers decoder-only language model obtained by structured pruning of Phi-2 (2.7B) down to 1.8B parameters, followed by fine-tuning on C4.
|
|
|
- **Developed by:** Lucio Dery, Steven Kolawole, Jean-François Kagy, Virginia Smith, Graham Neubig, Ameet Talwalkar |
|
- **Model type:** Decoder-only |
|
- **Language(s) (NLP):** English |
|
- **License:** MIT |
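

The checkpoint should load through the standard 🤗 transformers API. Below is a minimal usage sketch: the repo id is taken from the license link in this card, while the dtype and `trust_remote_code` settings are assumptions rather than documented requirements.

```python
# Hedged usage sketch: repo id comes from the license link in this card;
# dtype and trust_remote_code are assumptions, not documented settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "luciodery/Bonsai-PrunedPhi-1.8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision for single-GPU inference
    trust_remote_code=True,     # assumption: the pruned Phi-2 may ship custom modeling code
)

prompt = "Structured pruning of large language models"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```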
|
|
|
### Model Sources
|
|
|
|
|
|
- **Repository:** [https://github.com/ldery/Bonsai/tree/main](https://github.com/ldery/Bonsai/tree/main)

- **Paper:** [https://arxiv.org/abs/2402.05406](https://arxiv.org/abs/2402.05406)
|
|
|
|
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
Fine-tuned on 100K sequences of length 2048 from the [C4 dataset](https://huggingface.co/datasets/c4).
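

As an illustration of how such a corpus might be assembled, here is a sketch under assumptions (streaming C4 from the Hub, the Phi-2 tokenizer, and simple contiguous packing); it is not necessarily the authors' exact pipeline.

```python
# Hypothetical data-preparation sketch: pack streamed C4 text into
# 100K contiguous sequences of 2048 tokens each.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
stream = load_dataset("c4", "en", split="train", streaming=True)

SEQ_LEN, NUM_SEQS = 2048, 100_000
buffer, sequences = [], []
for example in stream:
    buffer.extend(tokenizer(example["text"])["input_ids"])
    # Emit fixed-length chunks as soon as the token buffer is long enough.
    while len(buffer) >= SEQ_LEN and len(sequences) < NUM_SEQS:
        sequences.append(buffer[:SEQ_LEN])
        buffer = buffer[SEQ_LEN:]
    if len(sequences) >= NUM_SEQS:
        break
```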
|
|
|
### Training Procedure |
|
|
|
Full fine-tuning of the pruned model, with a knowledge-distillation loss term (see the hyperparameters below).
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
- Distillation KL weight: 0.01

- Learning rate: 1e-4

- Batch size: 128

- Optimizer: AdamW

- Warmup steps: 5
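

The distillation KL weight above implies the fine-tuning objective combines next-token cross-entropy with a KL term against teacher logits, presumably from the unpruned Phi-2. Below is a minimal sketch of such a loss, as an assumption rather than the authors' exact implementation.

```python
# Hedged sketch of a distillation objective consistent with the
# hyperparameters above; function and variable names are illustrative.
import torch.nn.functional as F

KL_WEIGHT = 0.01  # "Distillation KL weight" from this card

def distillation_loss(student_logits, teacher_logits, labels):
    """Cross-entropy on the labels plus KL(teacher || student) on the logits."""
    vocab = student_logits.size(-1)
    ce = F.cross_entropy(student_logits.reshape(-1, vocab), labels.reshape(-1))
    kl = F.kl_div(
        F.log_softmax(student_logits, dim=-1).reshape(-1, vocab),
        F.softmax(teacher_logits, dim=-1).reshape(-1, vocab),
        reduction="batchmean",  # mean over all token positions
    )
    return ce + KL_WEIGHT * kl
```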
|
|
|
### License |
|
|
|
The model is licensed under the [MIT license](https://huggingface.co/luciodery/Bonsai-PrunedPhi-1.8B/blob/main/LICENSE). |
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
- **Hardware Type:** NVIDIA A6000 |
|
|
|
## Citation |
|
|
|
|
|
**BibTeX:** |
|
|
|
```bibtex
@misc{dery2024everybody,
      title={Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes},
      author={Lucio Dery and Steven Kolawole and Jean-Francois Kagy and Virginia Smith and Graham Neubig and Ameet Talwalkar},
      year={2024},
      eprint={2402.05406},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
|
|
|
|
|
## Model Card Authors
|
|
|
Lucio Dery: [email protected] |
|
|
|
## Model Card Contact |
|
|
|
[email protected] |