Model Card for AIAT/The_Scamper-opt70bqt
Model Details
Model Description
- Developed by: The Scamper
- Model type: Transformer
- Language(s) (NLP): Thai, English
- License: apache-2.0
- Finetuned from model: OpenThaiGPT-1.0.0 70B (https://huggingface.co/openthaigpt/openthaigpt-1.0.0-70b-chat)
Uses
The Tabular Question Answering Large Language Model is based on OpenThaiGPT and fine-tuned to convert natural language questions into SQL queries. It learns to map Thai-language questions onto SQL structures, so that information can be retrieved from databases without writing queries by hand.
Recommendations
How to Get Started with the Model
Use the code below to get started with the model.
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> model2_path = "AIAT/The_Scamper-opt70bqt"
>>> tokenizer = AutoTokenizer.from_pretrained(model2_path, padding_side="right", use_fast=False)
>>> model = AutoModelForCausalLM.from_pretrained(model2_path, device_map="auto")
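Once the model and tokenizer are loaded, a question can be turned into a prompt and passed to `generate`. The sketch below is one possible way to do this, not the authors' documented inference code: `build_prompt`, `generate_sql`, and the prompt layout are hypothetical, and the prompt template actually used during fine-tuning may differ.

```python
def build_prompt(instruction: str, question: str) -> str:
    """Join an instruction and a question into one prompt string.

    The layout here is a hypothetical placeholder, not the documented template.
    """
    return f"{instruction.strip()}\n\nQuestion: {question.strip()}\nSQL:"


def generate_sql(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode a completion and return only the newly generated text."""
    # Tokenize and move the tensors to the device that holds the model inputs.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With the `model` and `tokenizer` loaded above, `generate_sql(model, tokenizer, build_prompt(instruction, question))` would return the generated text, which should be checked before it is run against a real database.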
Training Details
Training Data
Dataset: https://huggingface.co/datasets/AIAT/The_Scamper-train
Training Procedure
Fine-tuning uses a dataset with three columns: "instruction", "question" and "SQL syntax". Here's a brief outline of the process:
Data Collection: Gather a dataset containing pairs of questions and their corresponding SQL queries. Ensure the questions cover various topics and query types, while the SQL queries represent the desired actions on a database.
Pre-processing: Clean and preprocess the data to remove noise, standardize formatting, and handle any inconsistencies. Tokenize the text and encode it into a format suitable for training.
Model Architecture: Utilize OpenThaiGPT 1.0.0 70B as the base model.
Fine-tuning Setup: Divide the dataset into training (90%) and test (10%) sets. Define the training procedure, including hyperparameters such as learning rate, batch size, and number of training epochs.
Fine-tuning Process: Train the model on the question-SQL pairs using the defined setup. During training, the model learns to predict the SQL query corresponding to a given question by minimizing a suitable loss function.
Testing: Evaluate the final model on a held-out test set to assess its generalization performance on unseen data.
Following this methodology, the model is fine-tuned to convert natural language questions into SQL syntax, so that structured databases can be queried in plain language.
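The split, formatting, and testing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training code used for this model: the helper names are hypothetical, the column keys are taken from the description above and may not match the dataset's actual field names, the text layout is a guess, and exact-match scoring is only one possible evaluation metric.

```python
import random


def split_rows(rows, test_fraction=0.1, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def to_training_text(row):
    """Concatenate the three columns into one training string (assumed layout)."""
    return f"{row['instruction']}\n\nQuestion: {row['question']}\nSQL: {row['SQL syntax']}"


def normalize_sql(sql):
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Lowercasing is crude: it also changes string literals inside the query.
    """
    return " ".join(sql.strip().rstrip(";").split()).lower()


def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if not references:
        return 0.0
    hits = sum(normalize_sql(p) == normalize_sql(r) for p, r in zip(predictions, references))
    return hits / len(references)
```

Exact match is strict: two queries that return the same rows but are written differently count as a miss, so comparing query results on a test database would be a more forgiving alternative.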