mergisi committed on
Commit
35cfefe
·
1 Parent(s): 92ee040

Update README.md

  1. README.md +20 -3
README.md CHANGED
@@ -19,18 +19,35 @@ AI2sql is a state-of-the-art LLM for converting natural language questions to SQ
19
 
20
  ## Model description
21
 
22
- More information needed
23
 
24
  ## Intended uses & limitations
25
 
26
- More information needed
27
 
28
  ## Training and evaluation data
29
 
30
- More information needed
31
 
32
  ## Training procedure
33
 
34
  ### Training hyperparameters
35
 
36
  The following hyperparameters were used during training:
 
19
 
20
  ## Model description
21
 
22
+ AI2SQL is a specialized LLM fine-tuned from Falcon-7b-instruct using LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning (PEFT) method, and tailored to interpret natural language questions and generate the corresponding SQL queries.
23
 
24
  ## Intended uses & limitations
25
 
26
+ AI2SQL is designed for data analysts, business intelligence professionals, and developers to facilitate the conversion of natural language questions into SQL queries. This tool aids those who are not proficient in SQL, enabling easier database querying. AI2SQL's performance is inherently tied to the characteristics of its training data. While it has been trained on a diverse and substantial dataset, it may not account for all possible SQL dialects or database structures. Careful review of the generated SQL queries is recommended.
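One way to act on the review recommendation above is to check each generated query against the target schema before running it. The following is a minimal sketch, assuming a SQLite-compatible schema; `check_sql` and the `employees` table are hypothetical names used only for illustration, and other SQL dialects would need their own engine.

```python
import sqlite3

def check_sql(schema_ddl: str, query: str) -> tuple:
    """Return (ok, message) after asking SQLite to plan the query without running it."""
    conn = sqlite3.connect(":memory:")
    try:
        # Build empty tables from the schema so table and column names resolve.
        conn.executescript(schema_ddl)
        # EXPLAIN QUERY PLAN parses the statement and resolves names,
        # but does not execute it against any data.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()

# Hypothetical schema used only for this illustration.
SCHEMA = "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL);"

print(check_sql(SCHEMA, "SELECT name FROM employees WHERE salary > 50000"))
print(check_sql(SCHEMA, "SELECT wage FROM employees"))
```

A check like this catches syntax errors and references to missing tables or columns. It does not confirm that the query answers the question that was asked, so a human review is still needed.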
27
 
28
  ## Training and evaluation data
29
 
30
+ AI2SQL was trained on 262,000 rows of paired natural language questions and SQL queries sourced from the Text-to-SQL Dataset, covering a wide range of domains and question complexities.
31
 
32
  ## Training procedure
33
 
34
+ AI2SQL was fine-tuned from the Falcon-7b-instruct model for the specialized task of converting natural language questions to SQL queries. The procedure is described below.
35
+
36
+ ### Overview
37
+ AI2SQL was trained in a multi-stage process, starting from the pre-trained Falcon-7b-instruct model, a large transformer-based language model. This base model was then fine-tuned with Low-Rank Adaptation (LoRA), a Parameter-Efficient Fine-Tuning (PEFT) method, specifically for the task of translating natural language to SQL queries.
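To illustrate why LoRA is parameter-efficient: for a frozen weight matrix of shape d × k, LoRA trains two small matrices of shapes d × r and r × k whose product is added to the frozen weights. The sketch below only does the parameter arithmetic. The dimensions and rank are illustrative assumptions, not the values used for AI2SQL, and `lora_param_counts` is a hypothetical helper.

```python
def lora_param_counts(d: int, k: int, r: int) -> dict:
    """Compare trainable parameters for full fine-tuning vs. a rank-r LoRA update."""
    full = d * k        # every entry of the d x k weight matrix is trainable
    lora = r * (d + k)  # B is d x r, A is r x k; only these two are trained
    return {"full": full, "lora": lora, "fraction": lora / full}

# Illustrative sizes only (assumed, not taken from the model card).
counts = lora_param_counts(d=4096, k=4096, r=8)
print(counts)
```

With these assumed sizes the LoRA update trains well under 1% of the parameters that full fine-tuning of the same matrix would.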
38
+
39
+ ### Data Preparation
40
+ The training dataset, sourced from the [Text-to-SQL Dataset](https://huggingface.co/datasets/Clinton/Text-to-sql-v1), contains 262,000 rows. Each row pairs a natural language question with its corresponding SQL query, and together the rows cover a diverse range of domains and query complexities.
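As a rough sketch of how one question/SQL pair could be turned into a supervised training example: the field names (`schema`, `question`, `sql`) and the prompt template below are assumptions made for illustration. The model card does not state the actual column names or the prompt format used for AI2SQL.

```python
# Hypothetical prompt layout; the real template is not documented in the model card.
PROMPT_TEMPLATE = (
    "### Schema:\n{schema}\n\n"
    "### Question:\n{question}\n\n"
    "### SQL:\n"
)

def build_example(row: dict) -> dict:
    """Turn one dataset row into a prompt/completion pair for supervised fine-tuning."""
    prompt = PROMPT_TEMPLATE.format(
        schema=row["schema"].strip(),
        question=row["question"].strip(),
    )
    return {"prompt": prompt, "completion": row["sql"].strip()}

# Made-up row, used only to show the shape of the output.
row = {
    "schema": "CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)",
    "question": "Which employees earn more than 50000?",
    "sql": "SELECT name FROM employees WHERE salary > 50000",
}
example = build_example(row)
print(example["prompt"] + example["completion"])
```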
41
+
42
+ ### Fine-Tuning Process
43
+ 1. **Data Preprocessing**: The dataset was preprocessed to normalize text and SQL queries, ensuring consistency in formatting and syntax.
44
+ 2. **Model Adaptation**: The Falcon-7b-instruct model was adapted using LoRA through PEFT, a technique that trains a small set of added low-rank weights while the base model's weights stay frozen, so the full model does not need to be retrained. This approach is particularly beneficial for adapting large-scale models to specific tasks with limited computational resources.
45
+ 3. **Training Strategy**: The model was trained in a supervised learning setup, where it learned to map natural language inputs to their corresponding SQL queries. Special attention was given to the model's ability to understand the semantics of the natural language questions and accurately reflect them in SQL syntax.
46
+ 4. **Validation and Testing**: Throughout the training process, the model was periodically evaluated on a held-out validation set to monitor its performance and prevent overfitting. The final model was tested on an independent test set to assess its generalization capabilities.
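The preprocessing and hold-out steps in the list above could look roughly like the following. This is a sketch under stated assumptions: the normalization rules, split fractions, and seed are illustrative choices, and `normalize_sql` and `split_pairs` are hypothetical helpers rather than the code used to train AI2SQL.

```python
import random
import re

def normalize_sql(sql: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon.

    Note: this simple rule also collapses whitespace inside string literals.
    """
    return re.sub(r"\s+", " ", sql).strip().rstrip(";").strip()

def split_pairs(pairs, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle deterministically, then carve out disjoint test and validation sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Made-up pairs, used only to exercise the helpers.
pairs = [(f"question {i}", normalize_sql(f"SELECT  {i}\n FROM t ;")) for i in range(100)]
train, val, test = split_pairs(pairs)
print(len(train), len(val), len(test))
```

Fixing the seed keeps the validation and test sets stable across runs, so scores from different training runs are comparable.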
47
+
48
+ ### Model Evaluation
49
+ The model's performance was evaluated based on its accuracy in generating correct SQL queries corresponding to the input natural language questions. Metrics such as precision, recall, and F1-score were used to quantify the model's effectiveness.
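The model card does not say how accuracy, precision, recall, and F1 were computed for SQL output. One possible reading, sketched below as an assumption, is exact-match accuracy over whole queries plus token-level precision, recall, and F1 between each predicted query and its reference. Both helper names are hypothetical.

```python
from collections import Counter

def token_prf(predicted: str, reference: str):
    """Token-overlap precision, recall, and F1 between two SQL strings.

    Lowercasing is a simplification: it also changes the case of string literals.
    """
    pred = Counter(predicted.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((pred & ref).values())  # multiset intersection of tokens
    precision = overlap / sum(pred.values()) if pred else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference, ignoring case and outer whitespace."""
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references)

p, r, f1 = token_prf("SELECT name FROM users",
                     "SELECT name FROM users WHERE age > 30")
print(p, r, f1)
```

String-overlap metrics undercount queries that are written differently but return the same result. Comparing execution results on a test database is a common complement, though the source does not say whether that was done here.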
50
+
51
  ### Training hyperparameters
52
 
53
  The following hyperparameters were used during training: