ArmelR committed
Commit b7bd64d
1 Parent(s): c16bd6d

Update README.md

Files changed (1): README.md (+43 -6)

README.md CHANGED
@@ -19,9 +19,46 @@ with permissive licenses, namely MIT and Apache 2.0. This set of code was furthe
 
 # Training setting and hyperparameters
 For our fine-tuning, we decided to follow a 2-step strategy.
- - Pretraining (Fine-tuning) with next token prediction on the previously built gradio dataset (this step should familiarize the model with the gradio syntax.)
- - Instruction fine-tuning on an instruction fine-tuning (this step should make the model conversational)
- ## Pretraining
- Gradio - ready 50 steps
- - Fine-tuning
- Oasst Guanaco 100 steps
 
 # Training setting and hyperparameters
 For our fine-tuning, we decided to follow a 2-step strategy.
+ - Pretraining (fine-tuning) with next-token prediction on the previously built Gradio dataset (this step should familiarize the model with the Gradio syntax).
+ - Instruction fine-tuning on an instruction dataset (this step should make the model conversational).
+ For both steps, we made use of parameter-efficient fine-tuning via the [PEFT](https://github.com/huggingface/peft) library, more precisely [LoRA](https://arxiv.org/abs/2106.09685). Our training script is the well-known [starcoder fine-tuning script](https://github.com/bigcode-project/starcoder).
+ 
+ ## Resources
+ Our training was done on 8 A100 GPUs with 80GB of memory each.
+ 
+ ## Pretraining
+ These are the parameters that we used:
+ - learning_rate: 5e-4
+ - warmup_steps: 5
+ - gradient_accumulation_steps: 4
+ - batch_size: 1
+ - sequence_length: 2048
+ - max_steps: 1000
+ - weight_decay: 0.05
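For reference, these settings imply an effective batch size of per-device batch size × gradient accumulation steps × number of GPUs (assuming plain data parallelism across the 8 GPUs mentioned above, which the README does not state explicitly):

```python
# Effective batch size under the hyperparameters listed above,
# assuming simple data parallelism over the 8 A100 GPUs.
per_device_batch_size = 1
gradient_accumulation_steps = 4
num_gpus = 8

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 32
```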
+ 
+ LoRA parameters:
+ - r = 16
+ - alpha = 32
+ - dropout = 0.05
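The LoRA values above map directly onto a PEFT `LoraConfig`. A minimal sketch — `task_type` is an illustrative assumption on our part, and the original training script may set additional fields such as target modules:

```python
from peft import LoraConfig

# LoRA hyperparameters taken from the list above.
# task_type is an assumption; the original script may differ.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```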
+ 
+ We stopped the training before the end and kept the *checkpoint-100* for the second step.
+ 
+ ## Fine-tuning
+ This step consisted of the instruction fine-tuning of the previous checkpoint. For that purpose, we used a modified version of [openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco).
+ The template for the instruction fine-tuning was `Question: {question}\n\nAnswer: {answer}`. We used exactly the same parameters as during the pretraining and we kept the *checkpoint-50*.
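The template above can be applied with a small helper (plain Python; the function name is ours, not part of the training code). At inference time the answer slot is left empty so the model completes it:

```python
def build_prompt(question: str, answer: str = "") -> str:
    # Template used for instruction fine-tuning:
    # "Question: {question}\n\nAnswer: {answer}"
    return f"Question: {question}\n\nAnswer: {answer}"

# Training example: both fields filled.
example = build_prompt("What is Gradio?", "A Python library for building ML demos.")

# Inference: answer left empty for the model to complete.
prompt = build_prompt("Create a gradio application that converts Celsius to Fahrenheit")
```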
+ 
+ ## Usage
+ The usage is straightforward and very similar to that of any other instruction fine-tuned model.
+ 
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ 
+ checkpoint_name = "ArmelR/starcoder-gradio-v0"
+ model = AutoModelForCausalLM.from_pretrained(checkpoint_name)
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint_name)
+ 
+ prompt = "Create a gradio application that helps to convert a temperature in Celsius into a temperature in Fahrenheit"
+ inputs = tokenizer(f"Question: {prompt}\n\nAnswer: ", return_tensors="pt")
+ # do_sample=True is needed for temperature/top_p to take effect
+ outputs = model.generate(inputs["input_ids"], do_sample=True, temperature=0.2, top_p=0.95, max_new_tokens=256)
+ # generate returns a batch of sequences; decode the first one
+ print(tokenizer.decode(outputs[0]))
+ ```