FredZhang7/one-for-all-toxicity-v3

---
license: cc-by-nc-3.0
datasets:
  - FredZhang7/toxi-text-3M
pipeline_tag: text-classification
---

I have decided to release all auto-moderation models at once sometime in July. The curated datasets used to train these models will be available first.

Finished training: 6/30/2023

Final Train & Validation Accuracy: 95-98%
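
For reference, accuracy here is presumably the usual fraction of examples whose predicted label matches the gold label. A minimal sketch of that computation follows; the `accuracy` helper and the toy labels are hypothetical illustrations, not the evaluation code or data used for these models.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that equal the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy data: 1 = toxic, 0 = non-toxic.
preds = [1, 0, 0, 1, 1]
gold = [1, 0, 1, 1, 1]
print(accuracy(preds, gold))  # 4 of 5 correct -> 0.8
```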

Large model (v2) will be available for PyTorch

Lightweight model and tokenizer (v1) will be available for transformers.js

Models tested: roberta, xlm-roberta, bert-tiny, bert-base-cased/uncased, bert-multilingual-cased/uncased, albert-large-v2

Model chosen based on cost-efficiency and performance: bert-multilingual-cased
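
Until the weights are published, the sketch below shows roughly how a classifier of this kind could be set up in PyTorch with the `transformers` library. It rests on several assumptions that this card does not confirm: that "bert-multilingual-cased" corresponds to the public `bert-base-multilingual-cased` checkpoint, that the task is binary (toxic vs. non-toxic), and that the helper names `load_base_classifier` and `classify` are purely illustrative. It loads the untuned base checkpoint with a freshly initialized classification head, so its outputs are meaningless until the model is fine-tuned.

```python
# Assumed Hub ID for the "bert-multilingual-cased" backbone named in this card.
BASE_CHECKPOINT = "bert-base-multilingual-cased"
# Assumption: two classes (non-toxic, toxic). The card does not state the label set.
NUM_LABELS = 2


def load_base_classifier(checkpoint=BASE_CHECKPOINT, num_labels=NUM_LABELS):
    """Load the tokenizer and the base model with an untrained classification head."""
    # Imported here so the module can be inspected without the heavy dependency.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model


def classify(texts, tokenizer, model):
    """Return per-class probabilities for a list of strings."""
    import torch

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)
```

Calling `tokenizer, model = load_base_classifier()` and then `classify(["some text"], tokenizer, model)` downloads the base checkpoint on first use and returns a tensor with one row per input and one column per assumed label.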