Llama-2-7b-hf-IDMGSP
This model is a LoRA adapter for meta-llama/Llama-2-7b-hf, fine-tuned on the tum-nlp/IDMGSP dataset. It achieves the following results on the evaluation split:
- Loss: 0.1450
- Accuracy: 0.9759
- F1: 0.9758
Model description
The base model was loaded in 4-bit quantization mode and fine-tuned using LoRA.
Intended uses & limitations
Labels: 0 = non-AI generated, 1 = AI generated.
This model is intended for classifying whether a text is AI generated. Code to run inference:
import torch
import transformers
from peft import LoraConfig, TaskType

# Label mapping, as listed above
id2label = {0: "non-AI generated", 1: "AI generated"}

class Model:
    def __init__(self, name) -> None:
        # name: model ID or local path passed to from_pretrained
        self.name = name
        # Tokenizer (Llama has no pad token, so the EOS token is reused)
        self.tokenizer = transformers.LlamaTokenizer.from_pretrained(self.name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        print(f"EOS: {self.tokenizer.eos_token}; Pad: {self.tokenizer.pad_token}")
        # Model: 4-bit NF4 quantization via bitsandbytes
        bnb_config = transformers.BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="bfloat16",
        )
        # LoRA configuration (kept from the original snippet; not applied below)
        self.peft_config = LoraConfig(
            task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
        )
        self.model = transformers.LlamaForSequenceClassification.from_pretrained(
            self.name,
            num_labels=2,
            quantization_config=bnb_config,
            device_map="auto",
        )
        self.model.config.pad_token_id = self.model.config.eos_token_id
        self.model.eval()

    def tokenize(self, text):
        # Not shown in the original snippet; assumed to be plain tokenization
        # of a single string into PyTorch tensors on the model's device.
        return self.tokenizer(text, return_tensors="pt").to(self.model.device)

    def predict(self, text):
        inputs = self.tokenize(text)
        with torch.no_grad():
            outputs = self.model(**inputs)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=-1)
        return id2label[predictions.item()]
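The last step of predict, turning the two class logits into a label, can be sketched without loading the model. This is a minimal illustration in plain Python rather than the card's own code: the helper name label_from_logits and the example logits are made up, and the softmax probability is an addition that the original predict does not return.

```python
import math

# Label mapping taken from the "Labels" line of this card.
ID2LABEL = {0: "non-AI generated", 1: "AI generated"}

def label_from_logits(logits):
    """Return (label, probability) for one example's two-class logits."""
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # argmax over classes, mirroring torch.argmax(logits, dim=-1)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Made-up logits, for illustration only.
label, prob = label_from_logits([-1.2, 2.3])
print(label, round(prob, 3))
```

Returning the probability alongside the label makes it possible to apply a confidence threshold downstream, which argmax alone does not allow.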
Training and evaluation data
The classifier_input subsplit of the tum-nlp/IDMGSP dataset.
Training procedure
Training hyperparameters
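BitsAndBytes and LoRA config parameters (as given in the inference code above): 4-bit NF4 quantization with double quantization and bfloat16 compute; LoRA with r=8, lora_alpha=16, lora_dropout=0.05, bias="none".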
GPU VRAM consumption during fine-tuning: 30.6 GB
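To give a feel for how small the trainable part is, here is a rough count of the parameters that LoRA adds at rank r=8 (the value in the LoraConfig shown in the inference code). It is a sketch under a stated assumption: the card does not list the target modules, so the code assumes adapters on the q_proj and v_proj attention projections only. The hidden size (4096) and layer count (32) are those of Llama-2-7b.

```python
# Llama-2-7b dimensions (from the base model's published config).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_LABELS = 2

# LoRA rank from the card's LoraConfig.
RANK = 8

# ASSUMPTION: adapters on q_proj and v_proj only; the card does not say.
TARGET_MODULES_PER_LAYER = 2

def lora_params(in_features, out_features, rank):
    """Parameters added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * in_features + out_features * rank

per_module = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
adapter_total = per_module * TARGET_MODULES_PER_LAYER * NUM_LAYERS

# The sequence-classification head is a bias-free linear layer (hidden -> labels).
head_params = HIDDEN_SIZE * NUM_LABELS

print(per_module, adapter_total, head_params)
```

Under this assumption the adapters add about 4.2 million parameters, a small fraction of the roughly 7 billion frozen base weights. A different set of target modules would change the total proportionally.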
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 500
- num_epochs: 5
- mixed_precision_training: Native AMP
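The linear scheduler with warmup can be sketched as a plain function of the step number. This is an illustration, not the training code: the function name linear_lr is made up, the total of 2490 steps is taken from the last row of the training results, and the sketch assumes the 500 warmup steps take effect rather than the 0.1 ratio (in the Hugging Face Trainer, a non-zero warmup_steps overrides warmup_ratio).

```python
PEAK_LR = 1e-4        # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 2490    # final step in the training results (5 epochs x 498 steps)

def linear_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to the peak over the warmup steps.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay linearly from the peak to 0 at the last step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)

for step in (0, 250, 500, 1495, 2490):
    print(step, linear_lr(step))
```

With these numbers the warmup covers the whole first epoch (498 steps), so the peak learning rate is only reached at the start of the second epoch.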
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
0.0766 | 1.0 | 498 | 0.1165 | 0.9615 | 0.9613 |
0.182 | 2.0 | 996 | 0.0934 | 0.9657 | 0.9648 |
0.037 | 3.0 | 1494 | 0.1190 | 0.9716 | 0.9710 |
0.0349 | 4.0 | 1992 | 0.1884 | 0.9688 | 0.9692 |
0.0046 | 5.0 | 2490 | 0.1450 | 0.9759 | 0.9758 |
Framework versions
- Transformers 4.35.0
- Pytorch 2.0.1
- Datasets 2.14.6
- Tokenizers 0.14.1
Model tree for ernlavr/Llama-2-7b-hf-IDMGSP
Base model
meta-llama/Llama-2-7b-hf