
	---
	language: en
	license: apache-2.0
	---

	# LoNAS Adapter Card: lonas-llama-7b-commonsense-adapter

	The super-adapter network obtained by fine-tuning LLaMA-7B on a unified commonsense reasoning dataset using LoNAS.

	## Model Details

	### Information

	- Adapter name: lonas-llama-7b-commonsense-adapter
	- Base model: [LLaMA-7b](https://huggingface.co/yahma/llama-7b-hf)
	- Domain: Commonsense
	- Subnetwork version: Super-network
	- NNCF Configuration: [nncf_lonas_llama_7b.json](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS/nncf_config/unified_commonsense/nncf_lonas_llama_7b.json)

	### Adapter Configuration

	- LoRA rank: 32
	- LoRA alpha: 64
	- LoRA target modules: q_proj, k_proj, v_proj, up_proj, gate_proj, down_proj
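
To give a sense of scale for this configuration, the sketch below computes the LoRA scaling factor (alpha divided by rank) and the number of adapter parameters at the full rank of 32. The layer dimensions (hidden size 4096, MLP intermediate size 11008, 32 decoder layers) are the standard LLaMA-7B values and are an assumption here, since this card does not state them. Sub-networks that use smaller ranks would have fewer adapter parameters than this figure.

```python
# Assumed LLaMA-7B dimensions (not stated in this card).
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32

# Values taken from the adapter configuration listed above.
LORA_RANK = 32
LORA_ALPHA = 64

# (in_features, out_features) of each targeted linear layer in one decoder block.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# LoRA scales the low-rank update by alpha / rank.
scaling = LORA_ALPHA / LORA_RANK

# Adapter parameters in one decoder block, then across all blocks.
per_layer = sum(lora_params(i, o, LORA_RANK) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS

print(f"scaling factor: {scaling}")
print(f"adapter params per layer: {per_layer:,}")
print(f"adapter params in total:  {total:,}")
```

Under these assumptions the full-rank adapters add roughly 72M parameters, about 1% of the 6.7B-parameter base model.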

	### Training Hyperparameters

	- Batch size: 16
	- Learning rate: 3e-4
	- Epochs: 6

	### Training Data

	Unified commonsense reasoning dataset: [commonsense_15k.json](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/ft-training_set/commonsense_15k.json).

	### Evaluation Data
	[BoolQ](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/boolq/test.json), [PIQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/piqa/test.json), [SIQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/social_i_qa/test.json), [HellaSwag](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/hellaswag/test.json), [WinoGrande](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/winogrande/test.json), [ARC-e](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/ARC-Easy/test.json), [ARC-c](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/ARC-Challenge/test.json), [OBQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/openbookqa/test.json).


	## How to use

	Refer to the [evaluation instructions](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation) in the LoNAS repository:
	```bash
	CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \
	--dataset_path None \
	--model_name_or_path yahma/llama-7b-hf \
	--lora \
	--lora_weights lonas-llama-7b-commonsense \
	--nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
	--do_test \
	--output_dir lonas-llama-7b-commonsense/results
	```

	## Evaluation Results

	Results of the heuristic sub-network discovered from the super-network:

	| Method | Total Params. | TFLOPs | BoolQ | PIQA | SIQA | HellaSwag | WinoG | ARC-e | ARC-c | OBQA | Average |
	|--------|---------------|--------|-------|------|------|-----------|-------|-------|-------|------|---------|
	| LoRA   | 6.7B          | 1.7    | 62.6  | 75.3 | 67.9 | 52.9      | 58.6  | 79.2  | 58.3  | 71.2 | 65.8    |
	| LoNAS  | 5.6B          | 1.4    | 62.9  | 73.0 | 68.7 | 51.4      | 63.9  | 72.3  | 58.5  | 71.0 | 65.2    |
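
As a quick sanity check on the table, the following sketch recomputes the Average column from the eight per-task accuracies and derives the relative reduction in parameters and TFLOPs of LoNAS compared with LoRA. All numbers are copied from the table. The helper names are illustrative only, and because the inputs are reported to one decimal place, the derived percentages are approximate.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-task accuracies in table order:
# BoolQ, PIQA, SIQA, HellaSwag, WinoG, ARC-e, ARC-c, OBQA.
RESULTS = {
    "LoRA": {
        "params_b": Decimal("6.7"),
        "tflops": Decimal("1.7"),
        "scores": ["62.6", "75.3", "67.9", "52.9", "58.6", "79.2", "58.3", "71.2"],
    },
    "LoNAS": {
        "params_b": Decimal("5.6"),
        "tflops": Decimal("1.4"),
        "scores": ["62.9", "73.0", "68.7", "51.4", "63.9", "72.3", "58.5", "71.0"],
    },
}

ONE_DECIMAL = Decimal("0.1")


def mean_score(scores):
    """Average the accuracies exactly, then round half up to one decimal."""
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(ONE_DECIMAL, rounding=ROUND_HALF_UP)


def relative_reduction(baseline, value):
    """Percentage by which `value` is smaller than `baseline`."""
    pct = (baseline - value) / baseline * 100
    return pct.quantize(ONE_DECIMAL, rounding=ROUND_HALF_UP)


averages = {name: mean_score(row["scores"]) for name, row in RESULTS.items()}
param_reduction = relative_reduction(
    RESULTS["LoRA"]["params_b"], RESULTS["LoNAS"]["params_b"]
)
tflops_reduction = relative_reduction(
    RESULTS["LoRA"]["tflops"], RESULTS["LoNAS"]["tflops"]
)

for name, avg in averages.items():
    print(f"{name}: average accuracy {avg}")
print(f"parameter reduction: {param_reduction}%")
print(f"TFLOPs reduction:    {tflops_reduction}%")
```

The recomputed averages match the table. On these figures, the LoNAS sub-network gives up 0.6 points of average accuracy for roughly 16% fewer parameters and 18% fewer TFLOPs.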


	## Model Sources

	- Repository: [https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS)
	- Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models

	## Citation

	```bibtex
	@inproceedings{
	munoz2024lonas,
	title={LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
	author={J. Pablo Muñoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
	booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation},
	year={2024},
	url={}
	}
	```

	## License

	Apache-2.0