Update README.md
AI2sql is a state-of-the-art LLM for converting natural language questions to SQL queries.
## Model description

AI2SQL is a specialized LLM fine-tuned from Falcon-7b-instruct with PEFT-LoRA, tailored for interpreting natural language questions and generating the corresponding SQL queries.
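Assuming the model is published as a standard PEFT adapter on top of `tiiuae/falcon-7b-instruct`, loading it for inference might look like the sketch below; the adapter repository id is a hypothetical placeholder, not one confirmed by this card.

```python
# Minimal inference sketch, assuming AI2SQL is distributed as a PEFT (LoRA)
# adapter on top of tiiuae/falcon-7b-instruct. The adapter id below is a
# hypothetical placeholder, not the card's actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "tiiuae/falcon-7b-instruct"
adapter_id = "your-org/AI2SQL"  # hypothetical adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA weights

prompt = "Question: How many customers placed an order in 2023?\nSQL:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=128, do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```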
## Intended uses & limitations

AI2SQL is designed for data analysts, business intelligence professionals, and developers who need to convert natural language questions into SQL queries, making database querying accessible to users who are not proficient in SQL. Its performance is inherently tied to the characteristics of its training data: although the model was trained on a large and diverse dataset, it may not cover every SQL dialect or database structure, so generated queries should be reviewed carefully before they are run.
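One lightweight way to follow that recommendation is to at least parse generated text before executing it against a database. The sketch below uses the third-party `sqlglot` parser as an assumption; it is not part of the AI2SQL release.

```python
# Sanity-check sketch: parse model output before running it against a database.
# Uses the third-party sqlglot parser; this is an illustrative assumption,
# not something this card prescribes.
import sqlglot
from sqlglot.errors import ParseError

def is_plausible_sql(text: str, dialect: str = "sqlite") -> bool:
    """Return True if the generated text parses as a single SQL statement."""
    try:
        sqlglot.parse_one(text, read=dialect)
        return True
    except ParseError:
        return False

print(is_plausible_sql("SELECT COUNT(*) FROM orders WHERE year = 2023"))  # True
print(is_plausible_sql("SELEC COUNT(* FROM orders"))                      # False
```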
## Training and evaluation data

The model was trained on a comprehensive dataset of 262,000 rows of paired natural language questions and SQL queries, sourced from the Text-to-SQL Dataset and covering a wide array of domains and question complexities.
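Assuming the `datasets` library, the data can be inspected directly from the repository linked in the Data Preparation section below; the column names are not documented in this card, so the sketch simply discovers them and prints the first example.

```python
# Inspection sketch for the training data. The dataset id comes from the link
# in the "Data Preparation" section; column names are not documented in this
# card, so we just look at the first row.
from datasets import load_dataset

ds = load_dataset("Clinton/Text-to-sql-v1", split="train")
print(ds.num_rows)      # expected to be on the order of 262,000
print(ds.column_names)  # discover the question/query field names
print(ds[0])            # one natural-language question / SQL query pair
```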
## Training procedure

The training procedure for AI2SQL, specialized for converting natural language questions to SQL queries on top of the Falcon-7b-instruct base model, is detailed below.
### Overview

AI2SQL was trained in a multi-stage process, starting from the pre-trained Falcon-7b-instruct model, a large transformer-based language model. The base model was then fine-tuned using a Parameter-Efficient Fine-Tuning (PEFT) approach with Low-Rank Adaptation (LoRA), specifically for the task of translating natural language to SQL queries.
### Data Preparation

The training dataset, sourced from the [Text-to-SQL Dataset](https://huggingface.co/datasets/Clinton/Text-to-sql-v1), includes 262,000 rows of paired data. Each pair consists of a natural language question and its corresponding SQL query, covering a diverse range of domains and query complexities.
### Fine-Tuning Process

1. **Data Preprocessing**: The dataset was preprocessed to normalize the text and SQL queries, ensuring consistent formatting and syntax.
2. **Model Adaptation**: The Falcon-7b-instruct model was adapted using PEFT-LoRA, a technique that makes efficient, targeted updates to a small set of injected weights without retraining the full model. This approach is particularly beneficial for adapting large-scale models to specific tasks with limited computational resources (a configuration sketch follows this list).
3. **Training Strategy**: The model was trained in a supervised learning setup, learning to map natural language inputs to their corresponding SQL queries, with particular attention to capturing the semantics of each question and reflecting it accurately in SQL syntax.
4. **Validation and Testing**: Throughout training, the model was periodically evaluated on a held-out validation set to monitor performance and prevent overfitting. The final model was tested on an independent test set to assess its generalization capabilities.
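As a rough illustration of step 2, a LoRA adaptation of Falcon-7b-instruct with the `peft` library could look like the following; all hyperparameter values and target-module choices are assumptions for illustration, since the card does not state the actual configuration.

```python
# LoRA adaptation sketch for Falcon-7b-instruct. All hyperparameter values
# (r, lora_alpha, dropout, target modules) are illustrative assumptions;
# the values actually used for AI2SQL are not stated in this card.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b-instruct")

lora_config = LoraConfig(
    r=16,                                # rank of the low-rank update matrices
    lora_alpha=32,                       # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```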
### Model Evaluation

The model's performance was evaluated on its accuracy in generating SQL queries that correctly correspond to the input natural language questions, with precision, recall, and F1-score used to quantify its effectiveness.
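The card does not specify how these metrics were computed for generated SQL; one common token-level formulation, shown purely as an assumed illustration, is:

```python
# Token-level precision/recall/F1 sketch for comparing a generated SQL query
# against a reference. This is one common formulation, assumed here for
# illustration; the card does not specify the authors' exact protocol.
from collections import Counter

def token_f1(predicted: str, reference: str) -> dict:
    pred_tokens = predicted.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens) if pred_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

print(token_f1(
    "SELECT name FROM users WHERE age > 30",
    "SELECT name FROM users WHERE age > 30 ORDER BY name",
))
```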
### Training hyperparameters

The following hyperparameters were used during training: