language: en
license: apache-2.0

LoNAS Adapter Card: lonas-llama-7b-commonsense-adapter

A super-adapter network fine-tuned with LoNAS on LLaMA-7B, using a unified commonsense reasoning dataset.

Model Details

Information

  • Adapter name: lonas-llama-7b-commonsense-adapter
  • Base model: LLaMA-7B
  • Domain: Commonsense
  • Subnetwork version: Super-network
  • NNCF Configuration: nncf_lonas_llama_7b.json

Adapter Configuration

  • LoRA rank: 32
  • LoRA alpha: 64
  • LoRA target modules: q_proj, k_proj, v_proj, up_proj, gate_proj, down_proj
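
The adapter settings above determine the LoRA scaling factor and the size of the full-rank adapter. The following is a minimal sketch, not the LoNAS implementation: the LLaMA-7B dimensions (hidden size 4096, MLP intermediate size 11008, 32 decoder layers) are assumptions that this card does not state, and `lora_params` and `total_adapter_params` are hypothetical helper names.

```python
# Adapter settings as listed in this card.
LORA_RANK = 32
LORA_ALPHA = 64
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "up_proj", "gate_proj", "down_proj"]

# Assumed LLaMA-7B dimensions (not stated in this card).
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32

# (in_features, out_features) of each targeted linear layer under those assumptions.
MODULE_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def total_adapter_params(rank=LORA_RANK):
    """Adapter parameters across all layers when every module uses `rank`."""
    per_layer = sum(lora_params(*MODULE_SHAPES[name], rank) for name in TARGET_MODULES)
    return per_layer * NUM_LAYERS


# Standard LoRA scales the low-rank update by alpha / rank.
scaling = LORA_ALPHA / LORA_RANK
print(f"LoRA scaling (alpha / rank): {scaling}")
print(f"Adapter parameters at rank {LORA_RANK}: {total_adapter_params():,}")
```

Under these assumptions the full-rank adapter holds about 71.6 million parameters, and a sub-network that activates lower ranks would hold fewer.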

Training Hyperparameters

  • Batch size: 16
  • Learning rate: 3e-4
  • Epochs: 6

Training Data

Unified commonsense reasoning dataset: commonsense_15k.json.

Evaluation Data

BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.

How to use

To evaluate the adapter, follow the evaluation instructions at https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation and run:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_commonsense.py \
    --dataset_path None \
    --model_name_or_path yahma/llama-7b-hf \
    --lora \
    --lora_weights lonas-llama-7b-commonsense \
    --nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
    --do_test \
    --output_dir lonas-llama-7b-commonsense/results
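
When the evaluation is launched from a Python script instead of a shell, the same invocation can be assembled as an argument list. This is a minimal sketch that assumes `run_commonsense.py` is in the current working directory; `build_eval_command` and `build_env` are hypothetical helpers, and only the flags shown in the command above are used.

```python
import os
import shlex
import sys


def build_eval_command(
    adapter_dir="lonas-llama-7b-commonsense",
    base_model="yahma/llama-7b-hf",
    nncf_config="nncf_config/unified_commonsense/nncf_lonas_llama_7b.json",
):
    """Return the evaluation command from this card as an argv list."""
    return [
        sys.executable, "run_commonsense.py",
        "--dataset_path", "None",
        "--model_name_or_path", base_model,
        "--lora",
        "--lora_weights", adapter_dir,
        "--nncf_config", nncf_config,
        "--do_test",
        "--output_dir", f"{adapter_dir}/results",
    ]


def build_env(devices):
    """Copy the current environment and select the visible GPUs."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = devices
    return env


if __name__ == "__main__":
    cmd = build_eval_command()
    # Print the command for inspection; to execute it, pass it to
    # subprocess.run(cmd, env=build_env("0"), check=True).
    print(shlex.join(cmd))
```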

Evaluation Results

Results of the heuristic sub-network discovered from the super-network:

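| Method | Total Params. | TFLOPs | BoolQ | PIQA | SIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Average |
|--------|---------------|--------|-------|------|------|-----------|------------|-------|-------|------|---------|
| LoRA   | 6.7B          | 1.7    | 62.6  | 75.3 | 67.9 | 52.9      | 58.6       | 79.2  | 58.3  | 71.2 | 65.8    |
| LoNAS  | 5.6B          | 1.4    | 62.9  | 73.0 | 68.7 | 51.4      | 63.9       | 72.3  | 58.5  | 71.0 | 65.2    |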
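
The Average column and the relative savings of the LoNAS sub-network can be recomputed from the reported numbers. The sketch below copies the values from the results above; `mean_accuracy` and `relative_reduction` are hypothetical helper names, and nothing here is part of the LoNAS code base.

```python
TASKS = ["BoolQ", "PIQA", "SIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA"]

# Values copied from the results reported in this card.
RESULTS = {
    "LoRA": {
        "params_b": 6.7,
        "tflops": 1.7,
        "scores": [62.6, 75.3, 67.9, 52.9, 58.6, 79.2, 58.3, 71.2],
        "reported_avg": 65.8,
    },
    "LoNAS": {
        "params_b": 5.6,
        "tflops": 1.4,
        "scores": [62.9, 73.0, 68.7, 51.4, 63.9, 72.3, 58.5, 71.0],
        "reported_avg": 65.2,
    },
}


def mean_accuracy(method):
    """Unweighted mean over the eight evaluation tasks."""
    scores = RESULTS[method]["scores"]
    return sum(scores) / len(scores)


def relative_reduction(baseline, value):
    """Fraction by which `value` is smaller than `baseline`."""
    return 1.0 - value / baseline


for method in RESULTS:
    print(f"{method}: mean accuracy {mean_accuracy(method):.2f}")

param_saving = relative_reduction(RESULTS["LoRA"]["params_b"], RESULTS["LoNAS"]["params_b"])
flop_saving = relative_reduction(RESULTS["LoRA"]["tflops"], RESULTS["LoNAS"]["tflops"])
print(f"Parameter reduction: {param_saving:.1%}")
print(f"TFLOPs reduction: {flop_saving:.1%}")
```

With these numbers the sub-network uses roughly 16% fewer parameters and roughly 18% fewer TFLOPs than the LoRA baseline, while the reported average accuracy is 0.6 points lower.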

Model Sources

  • Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

Citation

@inproceedings{
munoz2024lonas,
title={LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
author={J. Pablo Muñoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
booktitle={The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation},
year={2024},
url={}
}

License

Apache-2.0