---
library_name: peft
base_model: meta-llama/Llama-2-7b-hf
---
## Training procedure


The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float16
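
For readers who want to reproduce this setup, the 4-bit values above correspond to keyword arguments of `transformers.BitsAndBytesConfig`. The sketch below is one way to rebuild the config under that assumption; it is not the original training script. The helper name `to_bnb_config` is hypothetical, and `torch` and `transformers` are imported only when it is called.

```python
# Quantization settings copied from the list above. The llm_int8_* entries
# are omitted because they only matter for 8-bit loading.
QUANT_SETTINGS = {
    "load_in_4bit": True,
    "load_in_8bit": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "float16",
}


def to_bnb_config(settings=QUANT_SETTINGS):
    """Hypothetical helper: build a BitsAndBytesConfig from the dict above."""
    # Imported lazily so the settings can be inspected without a GPU stack.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(settings)
    # The model card stores the dtype as a string; convert it to a torch dtype.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as `quantization_config` when loading the base model with `from_pretrained`.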

### Framework versions

- PEFT 0.5.0

# Fine-Tuning Llama2-7b (Ludwig) for Code Generation

## Goal

The goal of this project is to fine-tune the Llama2-7b language model for code generation using QLoRA. The model takes a natural-language instruction as input and should return code as output. We first iterate on the base Llama-2-7b model with prompting alone, then instruction-fine-tune it.

As an example, if we prompt the model with this instruction:

```
Instruction: Create an array of length 5 which contains all even numbers between 1 and 10.
```

We want the model to produce exactly this response:

```
Response: array = [2, 4, 6, 8, 10]
```
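
The instruction/response pair above can be turned into a prompt string and parsed back with a couple of small helpers. This is a minimal sketch: the exact template (`Instruction: ...` followed by `Response:`) mirrors the example but is an assumption about how the notebook formats prompts, and the names `build_prompt` and `extract_response` are hypothetical.

```python
PROMPT_TEMPLATE = "Instruction: {instruction}\nResponse:"


def build_prompt(instruction: str) -> str:
    """Wrap a natural-language request in the Instruction/Response layout."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(generated: str) -> str:
    """Return the text after the first 'Response:' marker, or all of it if absent."""
    _, found, tail = generated.partition("Response:")
    return tail.strip() if found else generated.strip()


prompt = build_prompt(
    "Create an array of length 5 which contains all even numbers between 1 and 10."
)
# Stand-in for real model output: the prompt followed by the desired completion.
completion = prompt + " array = [2, 4, 6, 8, 10]"
print(extract_response(completion))  # array = [2, 4, 6, 8, 10]
```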

## QLoRA Method for Fine-Tuning

QLoRA is a parameter-efficient approach to fine-tuning large language models (LLMs). It keeps the base model's weights frozen in 4-bit quantized form and trains only small low-rank adapter (LoRA) matrices on top, which reduces the memory needed for fine-tuning. In the Hugging Face ecosystem, the quantization comes from `bitsandbytes` and the adapters from the PEFT library, both of which integrate with Transformers.
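
To make the savings concrete, here is a back-of-the-envelope calculation. It assumes a nominal 7 billion parameters, counts weight storage only (no activations, optimizer state, or quantization constants), and uses a LoRA rank of 8 purely for illustration; the rank actually used in this project is not stated here.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # LoRA learns two thin matrices: A (rank x d_in) and B (d_out x rank).
    return rank * (d_in + d_out)


NUM_PARAMS = 7e9  # nominal size of a "7b" model (assumption)

print(weight_memory_gb(NUM_PARAMS, 16))  # 14.0  (16-bit weights)
print(weight_memory_gb(NUM_PARAMS, 4))   # 3.5   (4-bit weights)

# One 4096 x 4096 projection (Llama-2-7b's hidden size) with rank 8 (assumed):
print(lora_params(4096, 4096, rank=8))   # 65536 trainable parameters
print(4096 * 4096)                       # 16777216 frozen parameters
```

Under these assumptions the adapter for one such projection is about 0.4% of the size of the frozen matrix it modifies.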

## Ludwig Data Format

Ludwig is driven by a declarative config that maps columns of the dataset to model features. The two main components of this config are:

- `input_features`: Defines the input features of the model. Each feature must have a `name` and `type`.
- `output_features`: Defines the output features of the model. Similar to input features, each output feature must have a `name` and `type`.

Here is an example of a simple Ludwig config:

```yaml
input_features:
  - name: instruction
    type: text
output_features:
  - name: output
    type: text
```

This config tells Ludwig to use the column called `instruction` in the dataset as an input feature and the `output` column as an output feature.
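
Because feature names refer to dataset columns, it can help to check the dataset header against the config before training. The sketch below writes the same config as a Python dict and reports any missing columns; the helper names and the sample column list are hypothetical.

```python
# The YAML config above, written as the equivalent Python dict.
config = {
    "input_features": [{"name": "instruction", "type": "text"}],
    "output_features": [{"name": "output", "type": "text"}],
}


def required_columns(cfg: dict) -> list:
    """Names of the dataset columns the config refers to."""
    features = cfg["input_features"] + cfg["output_features"]
    return [feature["name"] for feature in features]


def missing_columns(cfg: dict, columns) -> list:
    """Columns the config expects but the dataset does not provide."""
    available = set(columns)
    return [name for name in required_columns(cfg) if name not in available]


# Hypothetical dataset header, used only for illustration.
dataset_columns = ["instruction", "input", "output"]
print(missing_columns(config, dataset_columns))  # []
```

With Ludwig installed, a dict like this can be passed to `LudwigModel(config=config)` from `ludwig.api`. A real fine-tuning config for this project would need further keys, such as the base model and adapter settings, which are defined in the notebook rather than here.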

## Prerequisites

- Python 3.x
- Ludwig
- GPU (recommended for faster training)
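
A quick way to check these prerequisites from Python is sketched below. The function name `check_prerequisites` is hypothetical, and GPU detection assumes PyTorch is the installed backend; if `torch` is absent, the GPU entry is simply reported as `False`.

```python
import importlib.util
import sys


def check_prerequisites() -> dict:
    """Report which of the listed prerequisites are present in this environment."""
    report = {
        "python3": sys.version_info.major == 3,
        "ludwig": importlib.util.find_spec("ludwig") is not None,
        "gpu": False,
    }
    # GPU detection relies on PyTorch; skip it when torch is not installed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["gpu"] = bool(torch.cuda.is_available())
    return report


print(check_prerequisites())
```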

## Setup and Installation

1. Clone the repository:
   ```sh
   git clone https://github.com/omid-sar/Llama2-7B-Fine-Tuning--Google-Colab-.git
   ```
2. Open the notebook `Fine_Tuning_Llama2_7b(_Ludwig).ipynb` in Google Colab or a local Jupyter environment.
3. Install the required libraries:
   ```sh
   pip install ludwig
   ```
4. Follow the instructions in the notebook to download any additional datasets or models.

## Usage

1. Run the cells in the notebook sequentially, following the instructions and comments provided.
2. Modify the model configuration, training parameters, or input data as needed to suit your use case.