--- license: apache-2.0 base_model: unsloth/Llama-3.2-3B-Instruct tags: - text-to-sql - nl2sql - unsloth - llama - lora - qlora datasets: - spider metrics: - exact_match - similarity model-index: - name: querymind-nl2sql results: [] --- # 🧠 QueryMind: Natural Language to SQL Engine QueryMind is a domain-specific, highly-optimized **NL-to-SQL engine** powered by a fine-tuned **LLaMA 3.2 3B Instruct** model. It has been fine-tuned using **QLoRA (4-bit)** via **Unsloth** on the **Spider NL2SQL dataset** to translate plain English queries into accurate, schema-valid SQL statements based on a provided database schema. --- ## 🎯 Model Details - **Developed by:** Lakshitha Nuwan - **Model type:** Causal Language Model (Fine-tuned LLM) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Finetuned from model:** [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct) - **Training Framework:** Unsloth & PyTorch --- ## 🔗 Model Sources - **HuggingFace Repository:** [lakshitha722/querymind-nl2sql](https://huggingface.co/lakshitha722/querymind-nl2sql) - **Interactive Live Demo:** [HuggingFace Space Demo](https://huggingface.co/spaces/lakshitha722/querymind-nl2sql-demo) - **Experiment Tracking:** [Weights & Biases (W&B) Dashboard](https://wandb.ai/lakshithanuwan722-other/querymind-nl2sql) --- ## 💻 How to Get Started with the Model Use the code below to load the model and generate SQL queries using **Unsloth** (recommended for local GPUs) or standard HuggingFace **Transformers**. ### Inference with Unsloth (Recommended) ```python from unsloth import FastLanguageModel import torch MODEL_NAME = "lakshitha722/querymind-nl2sql" model, tokenizer = FastLanguageModel.from_pretrained( model_name = MODEL_NAME, max_seq_length = 1024, load_in_4bit = True, dtype = None, ) FastLanguageModel.for_inference(model) # 1. Define Prompt Template PROMPT_TEMPLATE = """Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: Convert the following natural language question to a SQL query based on the given database schema. Return ONLY the SQL query, nothing else. ### Schema: {schema} ### Question: {question} ### Response: """ # 2. Prepare Inputs schema = "Database: company\nTables: employees (id, name, department, salary, hire_date)" question = "What is the average salary by department?" prompt = PROMPT_TEMPLATE.format(schema=schema, question=question) inputs = tokenizer([prompt], return_tensors="pt").to("cuda") # 3. Generate with torch.no_grad(): outputs = model.generate( **inputs, max_new_tokens = 150, temperature = 0.1, do_sample = False, pad_token_id = tokenizer.eos_token_id, ) # 4. Decode Output input_length = inputs['input_ids'].shape[1] sql = tokenizer.decode(outputs[0][input_length:], skip_special_tokens=True).strip() print("Generated SQL:", sql)