metadata

language: en
license: apache-2.0

LoNAS Adapter Card: lonas-llama-7b-commonsense-adapter

This is the super-adapter-network obtained by fine-tuning LLaMA-7B on a unified commonsense reasoning dataset using LoNAS.

Paper Abstract

Recently, several approaches successfully demonstrated that weight-sharing Neural Architecture Search (NAS) can effectively explore a search space of elastic low-rank adapters (LoRA), allowing the parameter-efficient fine-tuning (PEFT) and compression of large language models. In this paper, we introduce a novel approach called Shears, demonstrating how the integration of cost-effective sparsity and a proposed Neural Low-rank adapter Search (NLS) algorithm can further improve the efficiency of PEFT approaches. Results demonstrate the benefits of Shears compared to other methods, reaching high sparsity levels while improving or with little drop in accuracy, utilizing a single GPU for a pair of hours.

Model Details

Information

Adapter name: lonas-llama-7b-commonsense-adapter
Base model: LLaMA-7b
Domain: Commonsense
Subnetwork version: Super-network
NNCF Configuration: nncf_lonas_llama_7b.json

Adapter Configuration

LoRA rank: 32
LoRA alpha: 64
LoRA target modules: q_proj, k_proj, v_proj, up_proj, gate_proj, down_proj
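
As a rough illustration of what this configuration implies, the sketch below computes the standard LoRA scaling factor (alpha divided by rank) and the number of adapter parameters at the listed rank of 32. The layer dimensions (hidden size 4096, MLP size 11008, 32 layers) are the usual LLaMA-7B values and are an assumption here, since this card does not state them. The helper names are hypothetical and not part of the LoNAS code.

```python
# Values taken from the adapter configuration in this card.
RANK = 32
ALPHA = 64

# ASSUMPTION: standard LLaMA-7B dimensions, not stated in this card.
HIDDEN = 4096
INTERMEDIATE = 11008
NUM_LAYERS = 32

# (input features, output features) of each targeted linear layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def total_adapter_params(rank: int = RANK) -> int:
    """Adapter parameters summed over all target modules and layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    print("scaling:", lora_scaling(ALPHA, RANK))
    print("adapter parameters at rank 32:", total_adapter_params())
```

Under these assumptions the scaling factor is 2.0 and the rank-32 adapters add about 71.6M parameters. Passing a smaller rank to `total_adapter_params` shows how the count shrinks for lower-rank adapters.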

Training Hyperparameters

Batch size: 16
Learning rate: 3e-4
Epochs: 6

Training Data

Unified commonsense reasoning dataset: commonsense_15k.json.

Evaluation Data

BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.

How to use

Refer to the evaluation instructions at https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \
    --dataset_path None \
    --model_name_or_path yahma/llama-7b-hf \
    --lora \
    --lora_weights lonas-llama-7b-commonsense \
    --nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
    --do_test \
    --output_dir lonas-llama-7b-commonsense/results

Evaluation Results

Results of the heuristic sub-network discovered from the super-network:

Method	Total Params.	TFLOPs	BoolQ	PIQA	SIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Average
LoRA	6.7B	1.7	62.6	75.3	67.9	52.9	58.6	79.2	58.3	71.2	65.8
LoNAS	5.6B	1.4	62.9	73.0	68.7	51.4	63.9	72.3	58.5	71.0	65.2
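
The Average column and the relative savings can be checked directly from the table. The sketch below copies the per-task scores, parameter counts, and TFLOPs from the two rows above and recomputes the mean accuracy and the relative reduction of LoNAS against LoRA. The function names are hypothetical, and the unweighted mean over the eight tasks is an assumption about how the Average column was derived.

```python
# Per-task accuracies copied from the table above, in column order:
# BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.
LORA_SCORES = [62.6, 75.3, 67.9, 52.9, 58.6, 79.2, 58.3, 71.2]
LONAS_SCORES = [62.9, 73.0, 68.7, 51.4, 63.9, 72.3, 58.5, 71.0]

# Total parameters (billions) and TFLOPs copied from the table above.
LORA_PARAMS_B, LONAS_PARAMS_B = 6.7, 5.6
LORA_TFLOPS, LONAS_TFLOPS = 1.7, 1.4


def average(scores: list[float]) -> float:
    """Unweighted mean over tasks (assumed to match the Average column)."""
    return sum(scores) / len(scores)


def relative_reduction(baseline: float, reduced: float) -> float:
    """Fraction by which `reduced` is smaller than `baseline`."""
    return (baseline - reduced) / baseline


if __name__ == "__main__":
    print(f"LoRA average:  {average(LORA_SCORES):.2f}")
    print(f"LoNAS average: {average(LONAS_SCORES):.2f}")
    print(f"Parameter reduction: {relative_reduction(LORA_PARAMS_B, LONAS_PARAMS_B):.1%}")
    print(f"TFLOPs reduction:    {relative_reduction(LORA_TFLOPS, LONAS_TFLOPS):.1%}")
```

With the table's rounded figures, the sub-network uses roughly 16% fewer parameters and about 18% fewer TFLOPs, while the average accuracy drops by about 0.6 points.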

Model Sources

Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS
Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models

Ethical Considerations

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.

Ethical Considerations	Description
Data	The adapter was trained using the commonsense_15k data mixture as described above.
Human life	The model is not intended to inform decisions central to human life or flourishing.
Mitigations	No additional risk mitigation strategies were considered during model development.
Risks and harms	This model has not been assessed for harm or biases, and should not be used for sensitive applications where it may cause harm.
Use cases	-

Citation

@inproceedings{
munoz2024lonas,
title={LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
author={J. Pablo Muñoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation},
year={2024},
url={}
}

License

Apache-2.0