Model Description
This model is a fine-tuned version of unsloth/Meta-Llama-3.1-8B
optimized for Text-to-SQL generation. Fine-tuning was performed with the Unsloth library using LoRA (Low-Rank Adaptation) adapters for parameter efficiency. The training data consists of the first 5,000 rows of the Clinton/Text-to-sql-v1 dataset.
- Developed by: Vedant Rajpurohit
- Model type: Causal Language Model
- Language(s): English
- Fine-tuned from model: unsloth/Meta-Llama-3.1-8B
- Model size: 8.03B parameters
- Precision: BF16
Direct Use
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load the model and tokenizer from the Hugging Face Hub
model_name = "Vedant3907/Text-to-Sql-llama3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,  # use torch.bfloat16 on GPUs that support it; the weights are stored in BF16
)
model.eval()
# Define your test prompt
sql_prompt = """Below are SQL table schemas paired with instruction that describes a task.
Using valid SQLite, write a response that appropriately completes the request for the provided tables.
### Instruction: What is the 2007 result when the 2010 result was 2r, at the US Open?
### Input: CREATE TABLE table_name_91 ( tournament VARCHAR )
### Response:"""
# Tokenize input
inputs = tokenizer(sql_prompt, return_tensors="pt").to("cuda")
# Generate SQL query
outputs = model.generate(
**inputs,
max_new_tokens=100,
    do_sample=True,  # sampling adds diversity; set do_sample=False for deterministic (greedy) SQL output
)
# Decode and print the generated output
generated_sql = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated SQL Query:")
print(generated_sql)
# Example generated response: SELECT 2007 FROM table_name_91 WHERE 2010 = "2r" AND tournament = "us open"
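Because the decoded text includes the full prompt, you may want to keep only the generated query. A minimal post-processing sketch, splitting on the ### Response: marker from the prompt template:
# Keep only the text generated after the response marker
sql_only = generated_sql.split("### Response:")[-1].strip()
print(sql_only)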
Bias, Risks, and Limitations
- The model was trained on only the first 5,000 rows of the dataset for 250 steps, so its exposure to SQL patterns and schemas is limited.
- The model may generate incorrect or ambiguous SQL queries for instructions that are unclear or outside the training distribution.
Training Details
Dataset
- Dataset Name: Clinton/Text-to-sql-v1
- Rows Used: First 5,000 rows of the dataset.
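For reference, loading and formatting this subset into the prompt template shown above might look roughly like the sketch below. The column names (instruction, input, response) and the template handling are assumptions, not taken from the actual training script:
from datasets import load_dataset

# Load only the first 5,000 rows of the training split
dataset = load_dataset("Clinton/Text-to-sql-v1", split="train[:5000]")

prompt_template = """Below are SQL table schemas paired with instruction that describes a task.
Using valid SQLite, write a response that appropriately completes the request for the provided tables.
### Instruction: {instruction}
### Input: {input}
### Response: {response}"""

def format_example(example):
    # Column names here are assumptions about the dataset schema;
    # in practice an EOS token is usually appended to each example.
    return {"text": prompt_template.format(
        instruction=example["instruction"],
        input=example["input"],
        response=example["response"],
    )}

dataset = dataset.map(format_example)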
Training Procedure
The model was fine-tuned using the Unsloth library with LoRA adapters for parameter-efficient training on a single GPU. The following TrainingArguments were used:
TrainingArguments(
per_device_train_batch_size = 2,
gradient_accumulation_steps = 4,
warmup_steps = 10, # 4% of 250 steps
max_steps = 250,
learning_rate = 1e-4,
fp16 = not is_bfloat16_supported(),
bf16 = is_bfloat16_supported(),
logging_steps = 10,
optim = "adamw_8bit",
weight_decay = 0.01,
lr_scheduler_type = "cosine",
seed = 3407,
output_dir = "outputs",
report_to = "none"
)
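For context, a minimal sketch of how these arguments might be wired into an Unsloth LoRA run is shown below, assuming the TrainingArguments block above is assigned to training_args and the formatted dataset from the Dataset section is available as dataset. The LoRA hyperparameters (r, lora_alpha, target modules), max_seq_length, and 4-bit loading are illustrative assumptions, not values confirmed by the training script:
from unsloth import FastLanguageModel
from trl import SFTTrainer

# Load the base model; 4-bit loading and sequence length are assumptions
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters; r, alpha, and target modules are illustrative defaults
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)

# dataset is the formatted prompt dataset from the Dataset section
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = training_args,  # the TrainingArguments shown above
)
trainer.train()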
Hardware
- Trained on Google Colab using a single NVIDIA T4 GPU