File size: 2,744 Bytes
642bd31 7971f19 642bd31 7971f19 642bd31 7971f19 ab07d20 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 |
---
library_name: peft
license: apache-2.0
datasets:
- Yasbok/Alpaca_arabic_instruct
language:
- ar
pipeline_tag: text-generation
---
# ๐ Falcon-7b-QLoRA-alpaca-arabic
This repo contains a low-rank adapter for Falcon-7b fit on the Stanford Alpaca dataset Arabic version [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct).
## Model Summary
- **Model Type:** Causal decoder-only
- **Language(s):** Arabic
- **Base Model:** [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) (License: [Apache 2.0](https://huggingface.co/tiiuae/falcon-7b#license))
- **Dataset:** [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct)
- **License(s):** Apache 2.0 inherited from "Base Model"
## Model Details
The model was fine-tuned in 8-bit precision using ๐ค `peft` adapters, `transformers`, and `bitsandbytes`. Training relied on a method called QLoRA introduced in this [paper](https://arxiv.org/abs/2305.14314). The run took approximately 3 hours and was executed on a workstation with a single A100-SXM NVIDIA GPU with 37 GB of available memory.
### Model Date
June 10, 2023
### Recommendations
We recommend users of this model to develop guardrails and to take appropriate precautions for any production use.
## How to Get Started with the Model
### Setup
```python
# Install packages
!pip install -q -U bitsandbytes loralib einops
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
```
### GPU Inference in 8-bit
This requires a GPU with at least 12 GB of memory.
### First, Load the Model
```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
# load the model
peft_model_id = "Ali-C137/falcon-7b-chat-alpaca-arabic"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(
config.base_model_name_or_path,
return_dict=True,
device_map={"":0},
trust_remote_code=True,
load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token
model = PeftModel.from_pretrained(model, peft_model_id)
```
### CUDA Info
- CUDA Version: 12.0
- Hardware: 1 A100-SXM
- Max Memory: {0: "37GB"}
- Device Map: {"": 0}
### Package Versions Employed
- `torch`: 2.0.1+cu118
- `transformers`: 4.30.0.dev0
- `peft`: 0.4.0.dev0
- `accelerate`: 0.19.0
- `bitsandbytes`: 0.39.0
- `einops`: 0.6.1
### This work is highly inspired from [Daniel Furman](https://huggingface.co/dfurman)'s work, so Thanks a lot Daniel |