mistral-nemo-gutenberg3-12B

Mahou-1.5-mistral-nemo-12B-lorablated finetuned on jondurbin/gutenberg-dpo-v0.1, nbeerbower/gutenberg2-dpo, and nbeerbower/gutenberg-moderne-dpo.
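A minimal usage sketch (an assumption; the card itself does not include inference code) with the transformers text-generation pipeline:

# Assumed usage example, not from the original card.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="nbeerbower/mistral-nemo-gutenberg3-12B",
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",
)
messages = [{"role": "user", "content": "Write the opening paragraph of a gothic short story."}]
out = pipe(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # the generated assistant reply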

Method

ORPO-tuned on 8x A100 GPUs for 2 epochs.

QLoRA config:

# QLoRA config (imports added for completeness; torch_dtype assumed to be
# bfloat16, matching the bf16=True flag in the training config below)
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

torch_dtype = torch.bfloat16

# Quantize the base model to 4-bit NF4 with double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)
# LoRA config: rank-16 adapters on all attention and MLP projection layers
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
)
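For context, a sketch (not part of the original card) of loading the base model with the quantization config above; the exact repo path of the base model is an assumption:

# Assumed loading step: pass the 4-bit config when loading the base model.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "flammenai/Mahou-1.5-mistral-nemo-12B-lorablated"  # assumed repo id for the base named above
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base)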

Training config:

from trl import ORPOConfig

new_model = "mistral-nemo-gutenberg3-12B"  # assumed; used as the run name

orpo_args = ORPOConfig(
    run_name=new_model,
    learning_rate=8e-6,
    lr_scheduler_type="linear",
    max_length=4096,
    max_prompt_length=2048,
    max_completion_length=2048,
    beta=0.1,                        # weight of the odds-ratio preference loss
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_8bit",
    num_train_epochs=2,
    evaluation_strategy="steps",
    eval_steps=0.2,                  # evaluate every 20% of training
    logging_steps=1,
    warmup_steps=10,
    max_grad_norm=10,
    report_to="wandb",
    output_dir="./results/",
    bf16=True,
    gradient_checkpointing=True,
)
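A hedged sketch of wiring the configs above into trl's ORPOTrainer (not from the card; the dataset choice and eval split below are illustrative only, and the card trained on three datasets):

# Assumed training wiring, not from the original card.
from datasets import load_dataset
from trl import ORPOTrainer

# Loading just one of the three listed datasets for brevity.
dataset = load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train")
split = dataset.train_test_split(test_size=0.05, seed=42)  # eval split is an assumption

trainer = ORPOTrainer(
    model=model,               # 4-bit base model loaded above
    args=orpo_args,
    train_dataset=split["train"],
    eval_dataset=split["test"],  # needed since evaluation_strategy="steps"
    peft_config=peft_config,   # attaches the LoRA adapters for QLoRA training
    tokenizer=tokenizer,
)
trainer.train()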
Model size: 12.2B params (BF16 safetensors)
