---
language: en
tags:
- text-classification
- pytorch
- ModernBERT
- bias
- multi-class-classification
- multi-label-classification
datasets:
- synthetic-biased-corpus
license: mit
metrics:
- accuracy
- f1
- precision
- recall
- matthews_correlation
base_model:
- answerdotai/ModernBERT-large
widget:
- text: Women are bad at math.
library_name: transformers
---

![banner](https://huggingface.co/cirimus/modernbert-large-bias-type-classifier/resolve/main/banner.png)

### Overview

This model was fine-tuned from [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on a synthetic dataset of biased statements and questions generated with Mistral 7B as part of the [GUS-Net paper](https://huggingface.co/papers/2410.08388). The model identifies and classifies bias in text across 11 categories, including racial, religious, gender, and age bias, making it useful for bias detection and mitigation in natural language processing tasks.

---

### Model Details

- **Base Model**: [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)
- **Fine-Tuning Dataset**: Synthetic biased corpus
- **Number of Labels**: 11
- **Problem Type**: Multi-label classification
- **Language**: English
- **License**: [MIT](https://opensource.org/licenses/MIT)
- **Fine-Tuning Framework**: Hugging Face Transformers

---

### Example Usage

Here’s how to use the model with Hugging Face Transformers:

```python
from transformers import pipeline

# Load the model; top_k=None returns scores for all labels
classifier = pipeline(
    "text-classification",
    model="cirimus/modernbert-large-bias-type-classifier",
    top_k=None  # replaces the deprecated return_all_scores=True
)

text = "Tall people are so clumsy."
predictions = classifier(text)

# Print predictions
for pred in sorted(predictions[0], key=lambda x: x['score'], reverse=True)[:5]:
    print(f"{pred['label']}: {pred['score']:.3f}")

# Output:
# physical: 1.000
# socioeconomic: 0.002
# gender: 0.002
# racial: 0.001
# age: 0.001
```
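
Because the classifier is multi-label, more than one category can apply to a single input. A minimal sketch of turning the scores into binary label assignments, using the same `0.5` threshold as in the evaluation below:

```python
# Keep every label whose score clears the 0.5 decision threshold
THRESHOLD = 0.5
assigned = [p["label"] for p in predictions[0] if p["score"] >= THRESHOLD]
print(assigned)  # e.g. ['physical'] for the example above
```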

---

### How the Model Was Created

The model was fine-tuned for bias detection using the following hyperparameters; a sketch of an equivalent training setup follows the list:

- **Learning Rate**: `3e-5`
- **Batch Size**: 16
- **Weight Decay**: `0.01`
- **Warmup Steps**: 500
- **Optimizer**: AdamW
- **Evaluation Metrics**: Precision, Recall, F1 Score (weighted), Accuracy
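
The training script itself is not part of this repository; the following is a minimal sketch of an equivalent Hugging Face `Trainer` setup under the hyperparameters above (AdamW is the `Trainer` default optimizer). The epoch count and the `train_dataset`/`eval_dataset` variables are assumptions, standing in for tokenized splits of the synthetic corpus:

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# 11 bias categories, one sigmoid output per label
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large",
    num_labels=11,
    problem_type="multi_label_classification",  # BCE loss over sigmoid outputs
)

args = TrainingArguments(
    output_dir="modernbert-large-bias-type-classifier",
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    weight_decay=0.01,
    warmup_steps=500,
    num_train_epochs=3,  # assumption; the card does not state the epoch count
)

# train_dataset / eval_dataset are hypothetical pre-tokenized datasets
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
```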

---

### Dataset

The synthetic dataset consists of biased statements and questions generated with Mistral 7B as part of the GUS-Net paper. It covers 11 bias categories (the model's label ordering can be inspected with the snippet after the list):

1. Racial
2. Religious
3. Gender
4. Age
5. Nationality
6. Sexuality
7. Socioeconomic
8. Educational
9. Disability
10. Political
11. Physical
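
The order of these categories in the model's output follows its `id2label` mapping, which can be read directly from the configuration:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("cirimus/modernbert-large-bias-type-classifier")
print(config.id2label)  # maps output indices to the category names above
```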

---

### Evaluation Results

The model was evaluated on the synthetic dataset’s test split. The overall metrics, computed at a decision threshold of `0.5`, are shown below; a sketch for reproducing metrics of this kind follows the tables.

#### Macro Averages:

| Metric       | Value  |
|--------------|--------|
| Accuracy     | 0.983  |
| Precision    | 0.930  |
| Recall       | 0.914  |
| F1           | 0.921  |
| MCC          | 0.912  |

#### Per-Label Results:

| Label          | Accuracy | Precision | Recall | F1    | MCC   | Support | Threshold |
|----------------|----------|-----------|--------|-------|-------|---------|-----------|
| Racial         | 0.975    | 0.871     | 0.889  | 0.880 | 0.866 | 388     | 0.5       |
| Religious      | 0.994    | 0.962     | 0.970  | 0.966 | 0.962 | 335     | 0.5       |
| Gender         | 0.976    | 0.930     | 0.925  | 0.927 | 0.913 | 615     | 0.5       |
| Age            | 0.990    | 0.964     | 0.931  | 0.947 | 0.941 | 375     | 0.5       |
| Nationality    | 0.972    | 0.924     | 0.881  | 0.902 | 0.886 | 554     | 0.5       |
| Sexuality      | 0.993    | 0.960     | 0.957  | 0.958 | 0.955 | 301     | 0.5       |
| Socioeconomic  | 0.964    | 0.909     | 0.818  | 0.861 | 0.842 | 516     | 0.5       |
| Educational    | 0.982    | 0.873     | 0.933  | 0.902 | 0.893 | 330     | 0.5       |
| Disability     | 0.986    | 0.923     | 0.887  | 0.905 | 0.897 | 283     | 0.5       |
| Political      | 0.988    | 0.958     | 0.938  | 0.948 | 0.941 | 438     | 0.5       |
| Physical       | 0.993    | 0.961     | 0.920  | 0.940 | 0.936 | 238     | 0.5       |
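
For reference, metrics of this kind can be computed with scikit-learn, assuming hypothetical `y_true` (ground-truth labels) and `y_prob` (sigmoid scores) NumPy arrays of shape `(n_examples, 11)`:

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
)

# y_true, y_prob: hypothetical arrays of shape (n_examples, 11)
y_pred = (y_prob >= 0.5).astype(int)  # binarize at the evaluation threshold

print("macro precision:", precision_score(y_true, y_pred, average="macro"))
print("macro recall:   ", recall_score(y_true, y_pred, average="macro"))
print("macro F1:       ", f1_score(y_true, y_pred, average="macro"))

# Accuracy and MCC are computed per label (one column per category)
for i in range(y_true.shape[1]):
    acc = accuracy_score(y_true[:, i], y_pred[:, i])
    mcc = matthews_corrcoef(y_true[:, i], y_pred[:, i])
    print(f"label {i}: accuracy={acc:.3f}, MCC={mcc:.3f}")
```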

---

### Intended Use

The model is designed to detect and classify bias in text across 11 categories. It can be used in applications such as:

- Content moderation
- Bias analysis in research
- Ethical AI development

---

### Limitations and Biases

- **Synthetic Nature**: The dataset consists of synthetic text, which may not fully represent real-world biases.
- **Category Overlap**: Certain biases may overlap, leading to challenges in precise classification.
- **Domain-Specific Generalization**: The model may not generalize well to domains outside the synthetic dataset’s scope.

---

### Environmental Impact

- **Hardware Used**: NVIDIA RTX 4090
- **Training Time**: ~2 hours
- **Carbon Emissions**: ~0.08 kg CO2 (calculated via [ML CO2 Impact Calculator](https://mlco2.github.io/impact)).

---

### Citation

If you use this model, please cite it as follows:

```bibtex
@misc{junquedefortuny2025biasdetection,
  title = {Bias Detection with ModernBERT-Large},
  author = {Enric Junqué de Fortuny},
  year = {2025},
  howpublished = {\url{https://huggingface.co/cirimus/modernbert-large-bias-type-classifier}},
}
```