---
base_model: unsloth/Qwen3-1.7B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: apache-2.0
language:
- en
---
# Fine-tuning Qwen3-1.7B for Text-to-SQL Task
This project demonstrates the fine-tuning of the `Qwen3-1.7B` language model using a combined and preprocessed dataset for Text-to-SQL generation. The goal is to train the model to generate SQL queries from natural language questions given database schemas.
## Dataset
We used the [fahmiaziz/text2sql-dataset](https://huggingface.co/datasets/fahmiaziz/text2sql-dataset), which merges examples from:
- **WikiSQL**
- **BIRD**
- **Spider**
- **Synthetic SQL samples**
Before training, the dataset was **cleaned and filtered** by (a sketch of this step follows the list):
- Removing DDL/DML examples (`INSERT`, `UPDATE`, `DELETE`, etc.)
- Deduplicating examples based on **semantic hashing of both the SQL and the question**
- Keeping only SELECT-style analytical queries
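The sketch below illustrates this cleaning step under simple assumptions: the dataset exposes `question` and `sql` columns (hypothetical names), and a hash of the normalized question/SQL pair stands in for the semantic hashing used in practice.
```python
import hashlib
from datasets import load_dataset

# Column names ("question", "sql") are assumptions; adjust to the actual dataset schema.
ds = load_dataset("fahmiaziz/text2sql-dataset", split="train")

def is_select_query(example):
    # Drop DDL/DML statements; keep SELECT-style analytical queries only.
    return example["sql"].lstrip().upper().startswith("SELECT")

def normalized_hash(example):
    # Stand-in for semantic hashing: hash the whitespace/case-normalized question + SQL pair.
    text = " ".join(example["question"].lower().split()) + " || " + " ".join(example["sql"].lower().split())
    return hashlib.sha1(text.encode()).hexdigest()

seen = set()

def is_first_occurrence(example):
    key = normalized_hash(example)
    if key in seen:
        return False
    seen.add(key)
    return True

ds = ds.filter(is_select_query)
ds = ds.filter(is_first_occurrence)
```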
## Training Format
Since Qwen3 models require a two-part output (`<think>` + final answer), and our dataset does not contain intermediate reasoning, we left the `<think>` section **empty** during fine-tuning.
### Example Format:
```
<|im_start|>system
Given the database schema and the user question, generate the corresponding SQL query.
<|im_end|>
<|im_start|>user
[SCHEMA]
CREATE TABLE Inclusive_Housing (Property_ID INT, Inclusive VARCHAR(10), Property_Size INT);
INSERT INTO Inclusive_Housing (Property_ID, Inclusive, Property_Size)
VALUES (1, 'Yes', 900), (2, 'No', 1100), (3, 'Yes', 800), (4, 'No', 1200);
[QUESTION]
What is the average property size in inclusive housing areas?
<|im_end|>
<|im_start|>assistant
<think>
</think>
SELECT AVG(Property_Size) FROM Inclusive_Housing WHERE Inclusive = 'Yes';
<|im_end|>
```
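A minimal sketch of how each example can be assembled into this format; the `schema`, `question`, and `sql` arguments are hypothetical field names, and the `<think>` block is left empty as described above. The resulting string goes into the `text` column referenced by `dataset_text_field` in the training configuration below.
```python
SYSTEM_PROMPT = "Given the database schema and the user question, generate the corresponding SQL query."

def build_training_text(schema: str, question: str, sql: str) -> str:
    # One training sample in Qwen3 chat format with an empty <think> block.
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}\n<|im_end|>\n"
        f"<|im_start|>user\n[SCHEMA]\n{schema}\n[QUESTION]\n{question}\n<|im_end|>\n"
        f"<|im_start|>assistant\n<think>\n</think>\n{sql}\n<|im_end|>"
    )

example_text = build_training_text(
    schema="CREATE TABLE Inclusive_Housing (Property_ID INT, Inclusive VARCHAR(10), Property_Size INT);",
    question="What is the average property size in inclusive housing areas?",
    sql="SELECT AVG(Property_Size) FROM Inclusive_Housing WHERE Inclusive = 'Yes';",
)
```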
## Training Configuration
Due to hardware limitations, **full model training** was not possible. Instead, we applied **LoRA (Low-Rank Adaptation)** with the following configuration (a setup sketch follows the list):
- **LoRA rank (`r`)**: 128
- **LoRA alpha**: 256
- **Hardware**: Kaggle T4 x2 GPUs
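A rough sketch of attaching these adapters with Unsloth; the target modules, 4-bit loading, and other arguments not listed above are assumptions rather than the recorded training setup:
```python
from unsloth import FastLanguageModel

# Load the base model; 4-bit loading is an assumption made to fit the T4 GPUs.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-1.7B",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank and alpha reported above.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=256,
    target_modules=[  # assumed: the usual attention/MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)
```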
### Training Hyperparameters
```
per_device_train_batch_size = 6,
gradient_accumulation_steps = 2,
warmup_steps = 5,
max_steps = 500,
num_train_epochs = 3,
learning_rate = 1e-4,
fp16 = not is_bf16_supported(),
bf16 = is_bf16_supported(),
logging_steps = 25,
optim = "adamw_8bit",
weight_decay = 0.01,
lr_scheduler_type = "linear",
seed = 3407,
output_dir = "outputs_v4",
dataset_text_field = "text",
max_seq_length = 1024,
```
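Roughly how these settings plug into TRL's `SFTTrainer`; this is a sketch, not the exact training script, and in newer TRL versions `dataset_text_field` and `max_seq_length` move into `SFTConfig`:
```python
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,                # LoRA-wrapped model from the sketch above
    tokenizer=tokenizer,
    train_dataset=ds,           # dataset with the preformatted "text" column
    dataset_text_field="text",
    max_seq_length=1024,
    args=TrainingArguments(
        per_device_train_batch_size=6,
        gradient_accumulation_steps=2,
        warmup_steps=5,
        max_steps=500,
        num_train_epochs=3,
        learning_rate=1e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=25,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs_v4",
    ),
)
trainer.train()
```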
## Training Results
```
global_step=500,
training_loss=0.5783241882324218
```
## Evaluation
We evaluated the model using the **Exact Match (EM)** score on a manually selected sample of 100 examples, achieving a score of **50%**.
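A minimal sketch of an exact-match check, assuming a simple normalization of case, whitespace, and trailing semicolons (the exact matching rules behind the reported score are not documented here):
```python
def normalize_sql(sql: str) -> str:
    # Lowercase, collapse whitespace, and drop a trailing semicolon before comparing.
    return " ".join(sql.strip().rstrip(";").lower().split())

def exact_match(predictions: list[str], references: list[str]) -> float:
    # Fraction of predictions whose normalized SQL equals the normalized reference.
    hits = sum(normalize_sql(p) == normalize_sql(r) for p, r in zip(predictions, references))
    return hits / len(references)
```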
---
## Notes
In future iterations, we plan to:
* Add more complex, long-context schemas
* Perform full fine-tuning
# Uploaded finetuned model
- **Developed by:** fahmiaziz
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen3-1.7B
This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)