---
license: cc-by-nc-3.0
datasets:
- FredZhang7/toxi-text-3M
pipeline_tag: text-classification
---

**I have decided to release all auto-moderation models at once sometime in July. The curated datasets for training these models will be available first.**

<br>

Finished training: 6/30/2023

Final Train & Validation Accuracy: 95-98%

Large model (v2) will be available for PyTorch

Lightweight model and tokenizer (v1) will be available for transformers.js

<br>

<br>

Models tested: roberta, xlm-roberta, bert-tiny, bert-base-cased/uncased, bert-multilingual-cased/uncased, albert-large-v2

Model chosen based on cost-efficiency and performance: bert-multilingual-cased