token-classification-ai-fine-tune

This model is a fine-tuned version of bert-base-uncased on the CoNLL-2003 dataset. It achieves a loss of 0.0474 on the evaluation set.

Model Description

This is a token classification model fine-tuned for Named Entity Recognition (NER), built on the bert-base-uncased architecture. It identifies entities such as people, organizations, and locations in text, and this version is tuned to run comfortably on CPU. Uploaded by bniladridas, it reaches a validation loss of 0.0474 on the CoNLL-2003 benchmark. For a GPU-accelerated version with CUDA support, see the GitHub repository.
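
To load this checkpoint, a minimal sketch using the Transformers auto classes (the repository id below simply mirrors this model card) would be:

  from transformers import AutoTokenizer, AutoModelForTokenClassification

  model_id = "bniladridas/token-classification-ai-fine-tune"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForTokenClassification.from_pretrained(model_id)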

Intended Uses & Limitations

Intended Uses

  • Extracting named entities from unstructured text (e.g., news articles, reports)
  • Powering NLP pipelines on CPU-based systems
  • Research or lightweight production use
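
As an illustration of the first two uses, a CPU-only extraction pass might look like the sketch below; the example sentence and settings are illustrative, not part of any released code:

  from transformers import pipeline

  ner = pipeline(
      "token-classification",
      model="bniladridas/token-classification-ai-fine-tune",
      aggregation_strategy="simple",   # merge word pieces into whole entity spans
      device=-1,                       # -1 runs inference on CPU
  )

  print(ner("Angela Merkel visited the Siemens headquarters in Munich."))
  # each prediction is a dict with entity_group, score, word, and start/end offsets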

Limitations

  • Trained on English text from CoNLL-2003, so it may not generalize well to other languages or domains
  • Uses bert-base-uncased tokenization (lowercase-only), potentially missing case-sensitive nuances
  • Optimized for NER; additional tuning needed for other token-classification tasks
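
The lowercasing limitation is easy to see directly: the uncased tokenizer strips capitalization before the model ever sees the text (the sentence below is only an illustration).

  from transformers import AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  print(tokenizer.tokenize("Apple hired May in March."))
  # every token comes back lowercased ('apple', 'may', 'march', ...), so the
  # capitalization cues that often mark proper nouns are not visible to the model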

Training and Evaluation Data

The model was trained and evaluated on the CoNLL-2003 dataset, a standard NER benchmark. It features annotated English news articles with entities like persons, organizations, and locations, split into training, validation, and test sets. Metrics here reflect the evaluation subset.
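
To inspect the same data, the dataset can be pulled through the Datasets library; the split and column names below follow the standard conll2003 loader on the Hub, not code shipped with this model:

  from datasets import load_dataset

  conll = load_dataset("conll2003")
  print(conll)                   # DatasetDict with train / validation / test splits
  example = conll["train"][0]
  print(example["tokens"])       # whitespace-tokenized words
  print(example["ner_tags"])     # integer-encoded NER labels (PER, ORG, LOC, MISC)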

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training; a sketch of the corresponding optimizer and scheduler setup follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
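
As a rough sketch, these settings map onto an optimizer and scheduler as shown below. The use of PyTorch's AdamW (the Trainer default), the 9-label CoNLL-2003 head, and the total step count read off the results table are assumptions, not published training code:

  import torch
  from transformers import AutoModelForTokenClassification, get_linear_schedule_with_warmup

  # CoNLL-2003 NER uses 9 BIO labels (O plus B-/I- for PER, ORG, LOC, MISC)
  model = AutoModelForTokenClassification.from_pretrained("bert-base-uncased", num_labels=9)

  optimizer = torch.optim.AdamW(
      model.parameters(),
      lr=2e-5,                # learning_rate
      betas=(0.9, 0.999),     # Adam betas
      eps=1e-8,               # Adam epsilon
  )

  total_steps = 5268          # 3 epochs x 1756 steps per epoch, per the results table
  scheduler = get_linear_schedule_with_warmup(
      optimizer,
      num_warmup_steps=500,   # lr_scheduler_warmup_steps
      num_training_steps=total_steps,
  )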

Training Results

Training Loss | Epoch | Step | Validation Loss
0.048         | 1.0   | 1756 | 0.0531
0.0251        | 2.0   | 3512 | 0.0473
0.016         | 3.0   | 5268 | 0.0474

Framework Versions

  • Transformers: 4.28.1
  • PyTorch: 2.0.1
  • Datasets: 1.18.3
  • Tokenizers: 0.13.3

Additional Notes

This version is optimized for CPU use with these intentional adjustments (a configuration sketch follows the list):

  1. Full-precision training: Swapped out fp16 for broader compatibility
  2. Streamlined batch sizes: Set to 8 for efficient CPU processing
  3. Simplified workflow: Skipped gradient accumulation for smoother CPU runs
  4. Full feature set: Retained all monitoring (e.g., TensorBoard) and saving capabilities
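
A hypothetical TrainingArguments configuration reflecting these adjustments is sketched below. Argument names follow the Transformers Trainer API, but values beyond those stated on this card (output directory, logging directory, save strategy) are assumptions:

  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="token-classification-ai-fine-tune",  # hypothetical output directory
      learning_rate=2e-5,
      per_device_train_batch_size=8,     # streamlined batch size (item 2)
      per_device_eval_batch_size=8,
      num_train_epochs=3,
      lr_scheduler_type="linear",
      warmup_steps=500,
      seed=42,
      fp16=False,                        # full-precision training (item 1)
      gradient_accumulation_steps=1,     # no gradient accumulation (item 3)
      logging_dir="logs",                # TensorBoard logging retained (item 4)
      report_to=["tensorboard"],
      save_strategy="epoch",             # assumption: checkpoints saved each epoch
      no_cuda=True,                      # keep everything on CPU
  )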

For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the GitHub repository. To clone it, run:

git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git

This model was pushed to the Hugging Face Hub for easy CPU-based deployment.
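
For reference, pushing a checkpoint like this one to the Hub typically goes through push_to_hub on the model and tokenizer; the local path and repository name below are placeholders, not the exact commands used for this upload:

  from transformers import AutoTokenizer, AutoModelForTokenClassification

  # "path/to/checkpoint" is a placeholder for the local fine-tuned model directory
  model = AutoModelForTokenClassification.from_pretrained("path/to/checkpoint")
  tokenizer = AutoTokenizer.from_pretrained("path/to/checkpoint")

  model.push_to_hub("token-classification-ai-fine-tune")
  tokenizer.push_to_hub("token-classification-ai-fine-tune")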
