---
base_model: unsloth/llama-3.2-3b-unsloth-bnb-4bit
tags:
- text-generation
- sql
- peft
- lora
- rslora
- unsloth
- llama3
- instruction-tuned
license: mit
---
# SQLGenie - LoRA Fine-Tuned LLaMA 3B for Text-to-SQL Generation
**SQLGenie** is a lightweight LoRA adapter fine-tuned on top of Unsloth’s 4-bit LLaMA 3 (3B) model. It converts natural language instructions into valid SQL queries with minimal compute overhead, making it ideal for integration into data-driven applications or chat interfaces.
It was trained on over 100K text-to-SQL examples spanning a variety of domains, such as education, technology, health, and more.
## Model Highlights
- **Base model**: `Llama3 3B`
- **Tokenizer**: Compatible with `Llama3 3B`
- **Fine-tuned for**: Text-to-SQL conversion
- **Accuracy**: > 85%
- **Language**: English
- **Format**: `safetensors`
## Model Dependencies
- **Python Version**: `3.10`
- **Libraries**: `unsloth` (install with `pip install unsloth`)
### Model Description
- **Developed by:** Merwin
- **Model type:** PEFT adapter (LoRA) for Causal Language Modeling
- **Language(s):** English
- **Fine-tuned from model:** [unsloth/llama-3.2-3b-unsloth-bnb-4bit](https://huggingface.co/unsloth/llama-3.2-3b-unsloth-bnb-4bit)
### Model Sources
- **Repository:** https://huggingface.co/mervp/SQLGenie
## Uses
### Direct Use
This model can be directly used to generate SQL queries from natural language prompts. Example use cases include:
- Building AI assistants for databases
- Enhancing query tools with NL-to-SQL capabilities
- Automating analytics queries in various domains
## Bias, Risks, and Limitations
While the model has been fine-tuned for SQL generation, it may:
- Produce invalid SQL in some edge cases
- Infer table or column names that are not present in the prompt
- Assume a generic SQL dialect (closest to MySQL/PostgreSQL)
### Recommendations
- Always validate and test generated queries before execution in a production database.
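One lightweight way to follow this recommendation is to dry-run each generated query against a throwaway in-memory SQLite database before it ever touches production. This is a sketch, not part of the model: `is_valid_sql` is a hypothetical helper, and because SQLite's dialect differs slightly from MySQL/PostgreSQL it is a sanity check for syntax and name resolution, not a full guarantee.

```python
import sqlite3

def is_valid_sql(sql: str, schema: str = "") -> bool:
    """Check that a generated query parses and resolves against the
    schema, using a disposable in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        if schema:
            conn.executescript(schema)  # recreate the schema context
        # EXPLAIN compiles the statement (syntax + column/table name
        # resolution) without actually running it
        conn.execute("EXPLAIN " + sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

For example, `is_valid_sql("SELECT name FROM t", "CREATE TABLE t(name TEXT);")` passes, while a typo such as `"SELEC name FROM t"` or a reference to a nonexistent column is rejected.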
Thanks for visiting and downloading this model!
If this model helped you, please consider leaving a 👍 like. Your support helps the model reach more developers and encourages further improvements.
---
## How to Get Started with the Model
```python
from unsloth import FastLanguageModel

# Load the LoRA adapter together with its 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mervp/SQLGenie",
    max_seq_length=2048,
    dtype=None,  # auto-detect
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable inference mode
prompt = """You are a text-to-SQL query translator.
Users will ask you questions in English,
and you will generate a SQL query based on their question.
Keep the SQL simple; the schema context has been provided to you.
### User Question:
{}
### Sql Context:
{}
### Sql Query:
{}
"""
question = "List the names of customers who have an account balance greater than 6000."
schema = """
CREATE TABLE socially_responsible_lending (
customer_id INT,
name VARCHAR(50),
account_balance DECIMAL(10, 2)
);
INSERT INTO socially_responsible_lending VALUES
(1, 'james Chad', 5000),
(2, 'Jane Rajesh', 7000),
(3, 'Alia Kapoor', 6000),
(4, 'Fatima Patil', 8000);
"""
inputs = tokenizer(
[prompt.format(question, schema, "")],
return_tensors="pt",
padding=True,
truncation=True
).to("cuda")
output = model.generate(
**inputs,
max_new_tokens=256,
temperature=0.2,
top_p=0.9,
top_k=50,
do_sample=True
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
if "### Sql Query:" in decoded_output:
sql_query = decoded_output.split("### Sql Query:")[-1].strip()
else:
sql_query = decoded_output.strip()
print(sql_query)
```
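Once a query string has been extracted, it can be exercised end-to-end against the sample schema above using an in-memory SQLite database. The query below is a plausible model output for the example question, hard-coded here for illustration; actual model output may vary, and SQLite's dialect differs slightly from MySQL/PostgreSQL.

```python
import sqlite3

# Sample schema and data from the example above
schema = """
CREATE TABLE socially_responsible_lending (
    customer_id INT,
    name VARCHAR(50),
    account_balance DECIMAL(10, 2)
);
INSERT INTO socially_responsible_lending VALUES
(1, 'james Chad', 5000),
(2, 'Jane Rajesh', 7000),
(3, 'Alia Kapoor', 6000),
(4, 'Fatima Patil', 8000);
"""

# A plausible generated query for the question above (illustrative only)
sql_query = (
    "SELECT name FROM socially_responsible_lending "
    "WHERE account_balance > 6000;"
)

conn = sqlite3.connect(":memory:")
conn.executescript(schema)
rows = conn.execute(sql_query).fetchall()
conn.close()
print(rows)  # [('Jane Rajesh',), ('Fatima Patil',)]
```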