|
# Training setting and hyperparameters

For our fine-tuning, we decided to follow a 2-step strategy:

- Pretraining (fine-tuning) with next-token prediction on the previously built gradio dataset (this step should familiarize the model with the gradio syntax).
- Instruction fine-tuning on an instruction dataset (this step should make the model conversational).

For both steps, we made use of parameter-efficient fine-tuning via the library [PEFT](https://github.com/huggingface/peft), more precisely [LoRA](https://arxiv.org/abs/2106.09685). Our training script is the famous [starcoder fine-tuning script](https://github.com/bigcode-project/starcoder).

## Resources

Our training was done on 8 A100 GPUs of 80GB.

## Pretraining

These are the parameters that we used:

- learning_rate: 5e-4
- gradient_accumulation_steps: 4
- batch_size: 1
- sequence_length: 2048
- max_steps: 1000
- warmup_steps: 5
- weight_decay: 0.05
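
Since the per-GPU batch size is 1, the effective batch size comes from gradient accumulation and data parallelism. A quick back-of-the-envelope sketch (it assumes every sequence is packed to the full 2048-token length, which is how the starcoder script prepares data):

```python
# Back-of-the-envelope: effective batch size and tokens seen during pretraining.
# Assumes fully packed sequences (every example is exactly sequence_length tokens).
batch_size = 1                    # per-GPU micro-batch
gradient_accumulation_steps = 4
num_gpus = 8
sequence_length = 2048
max_steps = 1000

effective_batch_size = batch_size * gradient_accumulation_steps * num_gpus
tokens_per_step = effective_batch_size * sequence_length
total_tokens = tokens_per_step * max_steps

print(effective_batch_size)  # 32 sequences per optimizer step
print(tokens_per_step)       # 65536 tokens per step
print(total_tokens)          # 65536000 tokens over the full 1000-step run
```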

LoRA parameters:

- r = 16
- alpha = 32
- dropout = 0.05
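
Here `r` is the rank of the low-rank update and `alpha` its scaling factor. As a minimal illustration of the arithmetic (plain Python, not the PEFT implementation; the 4096×4096 layer size is a hypothetical example):

```python
# Minimal LoRA arithmetic sketch (illustrative, not the PEFT implementation).
# For a d_out x d_in weight W, LoRA trains B (d_out x r) and A (r x d_in)
# and applies W_eff = W + (alpha / r) * B @ A, leaving W itself frozen.
d_in, d_out = 4096, 4096  # hypothetical layer size, for illustration only
r, alpha = 16, 32         # the values listed above
scaling = alpha / r       # multiplier applied to the B @ A update

full_update_params = d_in * d_out   # parameters in a dense weight update
lora_params = r * (d_in + d_out)    # parameters actually trained (A and B)

print(scaling)                              # 2.0
print(full_update_params // lora_params)    # 128 -> ~128x fewer trainable params
```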

We stopped the training before the end and kept *checkpoint-100* for the second step.

## Fine-tuning

This step consisted of the instruction fine-tuning of the previous checkpoint. For that purpose, we used a modified version of [openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco).
The template for the instruction fine-tuning was `Question: {question}\n\nAnswer: {answer}`. We used exactly the same parameters as during the pretraining and kept *checkpoint-50*.
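
Concretely, each (instruction, response) pair is rendered into a single training string with that template. A small sketch of the formatting (the helper name is ours, not from the training script):

```python
# Render an (instruction, response) pair with the Question/Answer template
# used for instruction fine-tuning. format_example is an illustrative name.
def format_example(question: str, answer: str) -> str:
    return f"Question: {question}\n\nAnswer: {answer}"

sample = format_example(
    "Create a gradio app with a text input",
    "import gradio as gr\n...",
)
print(sample)  # starts with "Question: " and separates the answer by a blank line
```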

## Usage

The usage is straightforward and very similar to that of any other instruction fine-tuned model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint_name = "ArmelR/starcoder-gradio-v0"
model = AutoModelForCausalLM.from_pretrained(checkpoint_name)
tokenizer = AutoTokenizer.from_pretrained(checkpoint_name)

prompt = "Create a gradio application that helps to convert a temperature in Celsius into a temperature in Fahrenheit"
inputs = tokenizer(f"Question: {prompt}\n\nAnswer: ", return_tensors="pt")
# do_sample=True is required for temperature/top_p to take effect;
# max_new_tokens bounds the length of the generated answer.
outputs = model.generate(inputs["input_ids"], do_sample=True, temperature=0.2, top_p=0.95, max_new_tokens=256)

print(tokenizer.decode(outputs[0]))
```