delayedkarma committed (verified) · Commit 159e007 · Parent: a873a3f

Create README.md

---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
- peft
base_model: mistralai/Mistral-7B-v0.1
datasets:
- b-mc2/sql-create-context
model-index:
- name: mistral-7b-text-to-sql
  results: []
reference:
- https://www.philschmid.de/fine-tune-llms-in-2024-with-trl
language:
- en
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mistral-7b-text-to-sql_full-model

- This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the b-mc2/sql-create-context dataset.
- These are the full model weights (the base model merged with the trained adapter weights); the code to use them for generation is given below, and an illustrative sketch of the merge step follows this list.

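For context, producing full merged weights from a PEFT LoRA adapter typically looks like the sketch below. This is illustrative only, not the exact script used for this repo; the adapter repo id `delayedkarma/mistral-7b-text-to-sql` and the float16 dtype are assumptions.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Assumed adapter repo id (matches the model-index name above); not confirmed by this card.
adapter_id = "delayedkarma/mistral-7b-text-to-sql"

# Load the base model with the LoRA adapter applied on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Fold the adapter weights into the base weights and drop the adapter modules.
merged = model.merge_and_unload()

# Save the merged model so it can be loaded without PEFT installed.
merged.save_pretrained("mistral-7b-text-to-sql_full-model", safe_serialization=True)

# Assumes the tokenizer was saved alongside the adapter; otherwise load it from the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
tokenizer.save_pretrained("mistral-7b-text-to-sql_full-model")
```
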
## Model description

- Model type: Language model
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: Mistral-7B-v0.1

## How to get started with the model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model and tokenizer directly from the Hub
tokenizer = AutoTokenizer.from_pretrained("delayedkarma/mistral-7b-text-to-sql_full-model")
model = AutoModelForCausalLM.from_pretrained("delayedkarma/mistral-7b-text-to-sql_full-model")

text = "How many matches scored 3–6, 7–6(5), 6–3?"
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
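
The training dataset pairs each question with its table schema (a CREATE TABLE statement), so prompts that include the schema should generally work better than a bare question. The snippet below is a hedged sketch of schema-conditioned prompting; the schema, the prompt wording, and the plain-text fallback template are illustrative assumptions, since the exact format used during training is not recorded on this card.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "delayedkarma/mistral-7b-text-to-sql_full-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative schema and question in the style of b-mc2/sql-create-context.
schema = "CREATE TABLE table_name_1 (opponent VARCHAR, score VARCHAR)"
question = "Who was the opponent when the score was 3-6, 7-6(5), 6-3?"

messages = [
    {"role": "system", "content": f"You are a text-to-SQL assistant. SCHEMA:\n{schema}"},
    {"role": "user", "content": question},
]

# Use the tokenizer's chat template if one was saved with the model;
# otherwise fall back to a plain concatenated prompt.
if tokenizer.chat_template is not None:
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
else:
    prompt = f"{schema}\n-- {question}\nSELECT"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```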

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch of a matching training setup follows the list:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 6
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3

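The sketch below shows how these hyperparameters map onto a `TrainingArguments`/`SFTTrainer` setup with TRL and PEFT, in the spirit of the referenced guide. It is illustrative only: the prompt formatting, LoRA settings, and max sequence length are assumptions that this card does not record, and the API shown assumes a TRL release contemporary with the framework versions listed below.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

dataset = load_dataset("b-mc2/sql-create-context", split="train")

# Hypothetical prompt layout; the formatting actually used for this model is not recorded here.
def format_sample(sample):
    return (
        f"-- Schema:\n{sample['context']}\n"
        f"-- Question: {sample['question']}\n"
        f"-- SQL:\n{sample['answer']}{tokenizer.eos_token}"
    )

# LoRA settings are assumptions taken from the referenced guide, not from this card.
peft_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Mirrors the hyperparameters listed above (effective train batch size 3 * 2 = 6).
args = TrainingArguments(
    output_dir="mistral-7b-text-to-sql",
    num_train_epochs=3,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    seed=42,
    optim="adamw_torch",  # Adam with betas=(0.9, 0.999) and epsilon=1e-8
    logging_steps=25,
    save_strategy="epoch",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    formatting_func=format_sample,
    max_seq_length=1024,  # assumption; long enough for schema + question + query
    packing=True,
)
trainer.train()
```
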
### Framework versions

- PEFT 0.7.2.dev0
- Transformers 4.36.2
- Pytorch 2.2.2
- Datasets 2.16.1
- Tokenizers 0.15.2