---
base_model: unsloth/llama-3.1-8b-instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
  - grpo
  - grammar-correction
license: apache-2.0
language:
  - en
  - uk
---

# Uploaded model

- **Developed by:** Daniil Maksymenko
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3.1-8b-instruct-unsloth-bnb-4bit

The model is still in development and needs more data and training runs, so use it with caution and please report bugs in the Community section. Some bugs are already known, such as failures on short inputs and difficulties with specific numeric values, and fixes for them are planned.

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Llama 3.1 8B was trained on the UA-GEC Fluency dataset to fix grammar, style, and spelling mistakes in Ukrainian text. Training was done with GRPO only, with no SFT or DPO stage.
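Unlike SFT, GRPO optimizes the policy against scalar reward functions rather than supervised targets. This card does not document which rewards were used for this model; as a purely hypothetical illustration, a reward that scores a candidate correction by character-level similarity to a reference correction, using TRL's reward-function signature, could look like:

```python
import difflib

def similarity_reward(completions, reference, **kwargs):
    """Hypothetical GRPO reward for grammar correction.

    TRL's GRPOTrainer calls reward functions with the sampled
    completions (extra dataset columns arrive via keyword arguments)
    and expects one float back per completion. The rewards actually
    used to train this model are not documented; this is only a
    sketch of the interface.
    """
    rewards = []
    for completion, ref in zip(completions, reference):
        # Ratio of matching characters between the candidate and the
        # reference correction, in [0.0, 1.0].
        ratio = difflib.SequenceMatcher(None, completion, ref).ratio()
        rewards.append(ratio)
    return rewards
```

With TRL, such a function would be passed to the trainer as `reward_funcs=[similarity_reward]`.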

TODO: train on short texts and single words / word combinations to avoid hallucinations caused by short inputs; also gather more data overall.
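To try the model, the usual `transformers` chat workflow should apply. This is a minimal sketch: the repo id and the system prompt below are assumptions (the card does not document the instruction used during training), so adjust both as needed.

```python
# Assumed repo id; check the model page for the exact name.
MODEL_ID = "thedanmaks/Llama-3.1-8B-UA-GEC"

def build_messages(text):
    """Wrap the input text in a Llama 3.1 Instruct chat message list.

    The system instruction is a guess; the prompt used during GRPO
    training is not documented in the card.
    """
    return [
        {"role": "system",
         "content": "Fix grammar, spelling and style mistakes in the text."},
        {"role": "user", "content": text},
    ]

if __name__ == "__main__":
    # Imported lazily so the prompt helper above works without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Example Ukrainian input with a spelling mistake ("неможу").
    inputs = tokenizer.apply_chat_template(
        build_messages("Я люблю читати, але неможу знайти час."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(output[0][inputs.shape[-1]:],
                           skip_special_tokens=True))
```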