metadata
language: en
license: apache-2.0
LoNAS Adapter Card: lonas-llama-7b-commonsense-adapter
The super-adapter network fine-tuned on LLaMA-7B with LoNAS, using the unified commonsense reasoning dataset (commonsense_15k.json).
Model Details
Information
- Adapter name: lonas-llama-7b-commonsense-adapter
- Base model: LLaMA-7B
- Domain: Commonsense
- Subnetwork version: Super-network
- NNCF Configuration: nncf_lonas_llama_7b.json
Adapter Configuration
- LoRA rank: 32
- LoRA alpha: 64
- LoRA target modules: q_proj, k_proj, v_proj, up_proj, gate_proj, down_proj
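As a rough illustration of what this configuration implies, the sketch below computes the LoRA scaling factor (alpha / rank) and the number of adapter parameters added at full rank. The layer dimensions (hidden size 4096, MLP intermediate size 11008, 32 decoder layers) are the published LLaMA-7B values and are an assumption here, not something stated in this card. The helper names are hypothetical.

```python
# Assumed LLaMA-7B dimensions (from the public model config, not from this card).
HIDDEN = 4096
INTERMEDIATE = 11008
NUM_LAYERS = 32

# Values taken from the adapter configuration listed above.
RANK = 32
ALPHA = 64

# (in_features, out_features) of each targeted linear layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def total_lora_params(shapes=TARGET_SHAPES, rank=RANK, num_layers=NUM_LAYERS):
    """Adapter parameters summed over all targeted modules and layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in shapes.values())
    return per_layer * num_layers


# LoRA scales the low-rank update by alpha / rank.
scaling = ALPHA / RANK

if __name__ == "__main__":
    print(f"LoRA scaling factor: {scaling}")
    print(f"Adapter parameters at full rank: {total_lora_params():,}")
```

Under these assumptions the count applies to the full-rank super-network; a sub-network that activates lower ranks would add fewer adapter parameters.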
Training Hyperparameters
- Batch size: 16
- Learning rate: 3e-4
- Epochs: 6
Training Data
Unified commonsense reasoning dataset: commonsense_15k.json.
Evaluation Data
BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.
How to use
Refer to https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation:
CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \
--dataset_path None \
--model_name_or_path yahma/llama-7b-hf \
--lora \
--lora_weights lonas-llama-7b-commonsense \
--nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
--do_test \
--output_dir lonas-llama-7b-commonsense/results
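For scripted runs, the same command can be assembled from Python. This is only a sketch: it reuses the exact flags shown above and assumes `run_commonsense.py` is in the working directory. The helpers `build_eval_command` and `run_eval` and their parameters are hypothetical and not part of the LoNAS repository.

```python
import os
import subprocess


def build_eval_command(
    lora_weights="lonas-llama-7b-commonsense",
    base_model="yahma/llama-7b-hf",
    nncf_config="nncf_config/unified_commonsense/nncf_lonas_llama_7b.json",
):
    """Return the evaluation command as an argv list (hypothetical helper)."""
    return [
        "python", "run_commonsense.py",
        "--dataset_path", "None",
        "--model_name_or_path", base_model,
        "--lora",
        "--lora_weights", lora_weights,
        "--nncf_config", nncf_config,
        "--do_test",
        "--output_dir", f"{lora_weights}/results",
    ]


def run_eval(devices="0"):
    """Run the evaluation with CUDA_VISIBLE_DEVICES set to `devices`."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=devices)
    return subprocess.run(build_eval_command(), env=env, check=True)


if __name__ == "__main__":
    # Print the command instead of running it; call run_eval() to execute.
    print(" ".join(build_eval_command()))
```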
Evaluation Results
Results of the heuristic sub-network discovered from the super-network:
Method | Total Params. | TFLOPs | BoolQ | PIQA | SIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Average |
---|---|---|---|---|---|---|---|---|---|---|---|
LoRA | 6.7B | 1.7 | 62.6 | 75.3 | 67.9 | 52.9 | 58.6 | 79.2 | 58.3 | 71.2 | 65.8 |
LoNAS | 5.6B | 1.4 | 62.9 | 73.0 | 68.7 | 51.4 | 63.9 | 72.3 | 58.5 | 71.0 | 65.2 |
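The Average column and the relative savings can be rechecked from the table itself. The sketch below copies the reported numbers verbatim, recomputes the mean over the eight tasks, and derives the relative reduction in parameters and TFLOPs of LoNAS versus LoRA. The variable and function names are illustrative only.

```python
# Numbers copied from the results table above.
ROWS = {
    "LoRA": {
        "params_b": 6.7,
        "tflops": 1.7,
        "scores": [62.6, 75.3, 67.9, 52.9, 58.6, 79.2, 58.3, 71.2],
        "average": 65.8,
    },
    "LoNAS": {
        "params_b": 5.6,
        "tflops": 1.4,
        "scores": [62.9, 73.0, 68.7, 51.4, 63.9, 72.3, 58.5, 71.0],
        "average": 65.2,
    },
}


def mean(values):
    """Arithmetic mean of the per-task accuracies."""
    return sum(values) / len(values)


def relative_reduction(baseline, value):
    """Fraction by which `value` is smaller than `baseline`."""
    return (baseline - value) / baseline


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{name}: recomputed mean {mean(row['scores']):.2f}, "
              f"reported {row['average']}")
    lora, lonas = ROWS["LoRA"], ROWS["LoNAS"]
    print(f"Parameter reduction: "
          f"{relative_reduction(lora['params_b'], lonas['params_b']):.1%}")
    print(f"TFLOPs reduction: "
          f"{relative_reduction(lora['tflops'], lonas['tflops']):.1%}")
```

With the reported figures, the sub-network uses roughly 16% fewer parameters and about 18% fewer TFLOPs than the LoRA baseline, while the average accuracy is 0.6 points lower.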
Model Sources
- Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS
- Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models
Ethical Considerations
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Ethical Considerations | Description |
---|---|
Data | The adapter was trained using the commonsense_15k data mixture as described above. |
Human life | The model is not intended to inform decisions central to human life or flourishing. |
Mitigations | No additional risk mitigation strategies were considered during model development. |
Risks and harms | This model has not been assessed for harm or biases, and should not be used for sensitive applications where it may cause harm. |
Use cases | - |
Citation
@inproceedings{
munoz2024lonas,
title={LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
author={J. Pablo Muñoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation},
year={2024},
url={}
}
License
Apache-2.0