RuBERTConv Toxic Classifier
Model description
Based on the rubert-base-cased-conversational model.
Intended uses & limitations
How to use
Colab: link
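from transformers import pipeline

# Model ID on the Hugging Face Hub; the same ID is used to load the tokenizer.
model_name = "IlyaGusev/rubertconv_toxic_clf"
pipe = pipeline("text-classification", model=model_name, tokenizer=model_name, framework="pt")

# Russian for "You are a moron from the internet".
text = "Ты придурок из интернета"
# The pipeline returns one dict per input text, holding the predicted label and its score.
print(pipe([text]))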
Training data
Datasets:
Augmentations:
- Replace "ё" with "е"
- Remove or add "?" or "!"
- Fix CAPS
- Concatenate toxic and non-toxic texts
- Concatenate two non-toxic texts
- Add toxic words from vocabulary
- Add typos
- Mask toxic words with "*", "@", "$"
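The augmentations listed above could be sketched roughly as follows. This is an illustration only, not the author's training code: every function name, the 0/1 label convention, the single-inner-character masking, the adjacent-swap typo, and the rule that a concatenation is toxic if either part is toxic are assumptions made for the example.

```python
import random

MASK_CHARS = "*@$"  # masking symbols named in the list above


def replace_yo(text: str) -> str:
    """Replace 'ё' with 'е' (both cases)."""
    return text.replace("ё", "е").replace("Ё", "Е")


def fix_caps(text: str) -> str:
    """Assumed reading of 'Fix CAPS': turn an all-caps text into sentence case."""
    return text.capitalize() if text.isupper() else text


def strip_end_punct(text: str) -> str:
    """Remove trailing '?' and '!' characters."""
    return text.rstrip("?!")


def add_end_punct(text: str, rng: random.Random) -> str:
    """Append a random '?' or '!'."""
    return text + rng.choice("?!")


def mask_word(word: str, rng: random.Random) -> str:
    """Replace one inner character of a word with a masking symbol."""
    if len(word) < 3:
        return word
    i = rng.randrange(1, len(word) - 1)
    return word[:i] + rng.choice(MASK_CHARS) + word[i + 1:]


def mask_toxic_words(text: str, toxic_vocab: set, rng: random.Random) -> str:
    """Mask every whitespace-separated token found in a (hypothetical) toxic vocabulary."""
    tokens = text.split(" ")
    return " ".join(mask_word(t, rng) if t.lower() in toxic_vocab else t for t in tokens)


def add_typo(text: str, rng: random.Random) -> str:
    """Simulate a typo by swapping two adjacent characters."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def concat_examples(a: tuple, b: tuple, rng: random.Random) -> tuple:
    """Join two (text, label) pairs in random order.

    Assumption: labels are 0 (non-toxic) and 1 (toxic), and the result
    is toxic if either part is toxic.
    """
    (text_a, label_a), (text_b, label_b) = a, b
    parts = [text_a, text_b]
    rng.shuffle(parts)
    return " ".join(parts), max(label_a, label_b)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(replace_yo("Ещё ёлка"))
    print(fix_caps("ТЫ ПРИДУРОК"))
    print(mask_toxic_words("Ты придурок", {"придурок"}, rng))
    print(concat_examples(("Привет", 0), ("Ты придурок", 1), rng))
```

Passing an explicit `random.Random` instance keeps each augmentation reproducible, which makes it easier to check that the augmented training set can be regenerated.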
Training procedure
TBA