mentBERT / README.md
metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - f1
  - auc
model-index:
  - name: pretrained_model
    results:
      - task:
          name: Text Classification
          type: text-classification
        metrics:
          - name: F1
            type: f1
            value: 0.6356
          - name: AUC
            type: auc
            value: 0.7643
widget:
  - text: >-
      I have trouble understanding what other people think or feel. I also like
      numbers, and finding patterns in numbers.

This model is a hybrid fine-tuned version of distilbert-base-uncased, trained on a Reddit dataset of users' posts about their mental health. It predicts mental health disorders from text.

It achieves the following results on the validation set:

  • Loss: 0.1873
  • F1: 0.6356
  • AUC: 0.7643
  • Precision: 0.7671

Description

This model builds on DistilBERT, a lighter variant of BERT, to predict different mental disorders.

  • It combines the text with sentiment and emotion features (from distilbert-base-uncased-finetuned-sst-2-english and roberta-base-go_emotions).
  • It was trained on a custom dataset of Reddit posts describing users' general experiences with mental health problems.
  • All direct mentions of the disorder names in the texts were removed.
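
The last bullet describes removing direct mentions of disorder names. A minimal sketch of how such a step could look is a case-insensitive regex pass. The term list, the empty placeholder, and the function name `mask_disorder_mentions` are illustrative assumptions; the README does not document the actual preprocessing.

```python
import re

# Hypothetical term list: the terms actually removed by the author are not documented.
DISORDER_TERMS = [
    "borderline", "bpd", "anxiety", "depression", "depressed", "bipolar",
    "ocd", "adhd", "schizophrenia", "asperger", "aspergers", "ptsd",
]

# Word boundaries keep the pattern from matching inside longer words.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(t) for t in sorted(DISORDER_TERMS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def mask_disorder_mentions(text: str, placeholder: str = "") -> str:
    """Replace disorder names with a placeholder and collapse leftover whitespace."""
    masked = _PATTERN.sub(placeholder, text)
    return re.sub(r"\s{2,}", " ", masked).strip()


print(mask_disorder_mentions("I was diagnosed with ADHD and anxiety last year."))
```

Removing the labels this way forces a classifier to rely on how an experience is described rather than on the disorder name itself.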

It includes the following classes:

  • Borderline
  • Anxiety
  • Depression
  • Bipolar
  • OCD
  • ADHD
  • Schizophrenia
  • Asperger
  • PTSD

Training

Train size: 90%
Val size: 10%

Training set class counts (text samples) after balancing:
Borderline: 10398
Anxiety: 10393
Depression: 10400
Bipolar: 10359
OCD: 10413
ADHD: 10412
Schizophrenia: 10447
Asperger: 10470
PTSD: 10489

Validation set class counts after balancing:
Borderline: 1180
Anxiety: 1185
Depression: 1178
Bipolar: 1219
OCD: 1165
ADHD: 1166
Schizophrenia: 1131
Asperger: 1108
PTSD: 1089
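
The counts above can be checked with a few lines of arithmetic. The sketch below only uses the figures listed in this README. It suggests that every class was balanced to the same total before a random 90/10 split, although that ordering is an inference and not something the README states.

```python
# Per-class sample counts copied from the two lists above.
train_counts = {
    "Borderline": 10398, "Anxiety": 10393, "Depression": 10400,
    "Bipolar": 10359, "OCD": 10413, "ADHD": 10412,
    "Schizophrenia": 10447, "Asperger": 10470, "PTSD": 10489,
}
val_counts = {
    "Borderline": 1180, "Anxiety": 1185, "Depression": 1178,
    "Bipolar": 1219, "OCD": 1165, "ADHD": 1166,
    "Schizophrenia": 1131, "Asperger": 1108, "PTSD": 1089,
}

train_total = sum(train_counts.values())
val_total = sum(val_counts.values())
total = train_total + val_total

# Train + validation samples for each class.
per_class_total = {name: train_counts[name] + val_counts[name] for name in train_counts}

print(f"train={train_total} val={val_total} train_share={train_total / total:.4f}")
print("distinct per-class totals:", sorted(set(per_class_total.values())))
```

Every class sums to the same total across the two splits, and the training share comes out at roughly 90%.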

Base model for fine-tuning: distilbert/distilbert-base-uncased

Additional features (GoEmotions from SamLowe/roberta-base-go_emotions, SST-2 from distilbert/distilbert-base-uncased-finetuned-sst-2-english):
negative, positive, admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity,
desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief,
joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, neutral
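
The list above gives 30 auxiliary scores: 2 sentiment labels and 28 emotion labels. The sketch below shows one way to arrange them in a fixed order and attach them to a text embedding. Fusion by plain concatenation and the helper names `build_feature_vector` and `combine` are assumptions made for illustration; the README does not describe how the features are actually merged with DistilBERT.

```python
SENTIMENT_LABELS = ["negative", "positive"]
EMOTION_LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]
FEATURE_ORDER = SENTIMENT_LABELS + EMOTION_LABELS  # 30 auxiliary features

TEXT_EMBEDDING_DIM = 768  # hidden size of distilbert-base-uncased


def build_feature_vector(sentiment_scores, emotion_scores):
    """Return the auxiliary scores in FEATURE_ORDER; missing labels default to 0.0."""
    scores = {k.lower(): v for k, v in {**sentiment_scores, **emotion_scores}.items()}
    return [float(scores.get(label, 0.0)) for label in FEATURE_ORDER]


def combine(text_embedding, extra_features):
    """Concatenate a text embedding with the auxiliary features (assumed fusion)."""
    if len(text_embedding) != TEXT_EMBEDDING_DIM:
        raise ValueError("unexpected embedding size")
    return list(text_embedding) + list(extra_features)


extra = build_feature_vector({"NEGATIVE": 0.9, "POSITIVE": 0.1}, {"sadness": 0.7})
fused = combine([0.0] * TEXT_EMBEDDING_DIM, extra)
print(len(extra), len(fused))
```

Keeping the label order fixed matters because a classifier head sees the scores only by position.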

The following hyperparameters were used during training:

learning_rate: 1e-5
train_batch_size: 64
val_batch_size: 64
weight_decay: 0.01
optimizer: AdamW
num_epochs: 2-3
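
For reference, the hyperparameters above can be captured in a small config object, which also makes it easy to estimate the number of optimizer steps. This is a sketch under stated assumptions: `TrainConfig` and `steps_per_epoch` are hypothetical names, and the step count assumes a single device, no gradient accumulation, and that the last partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 1e-5
    train_batch_size: int = 64
    val_batch_size: int = 64
    weight_decay: float = 0.01
    optimizer: str = "AdamW"
    num_epochs: int = 2  # the README lists "2-3"; the results table reports two epochs


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches per epoch when the final partial batch is kept."""
    return math.ceil(num_samples / batch_size)


cfg = TrainConfig()
train_samples = 93781  # sum of the training class counts listed in this README
per_epoch = steps_per_epoch(train_samples, cfg.train_batch_size)
print(per_epoch, per_epoch * cfg.num_epochs)
```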

Training results

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1.0   | 0.2660        | 0.2031          |
| 2.0   | 0.1891        | 0.1872          |

F1 Score: 0.6355
AUC Score: 0.7642

Classification Report

Borderline:
Precision: 0.7606
Recall: 0.4525
F1-score: 0.5674

Anxiety:
Precision: 0.7063
Recall: 0.5459
F1-score: 0.6158

Depression:
Precision: 0.7286
Recall: 0.4626
F1-score: 0.5659

Bipolar:
Precision: 0.7997
Recall: 0.4487
F1-score: 0.5748

OCD:
Precision: 0.8222
Recall: 0.5957
F1-score: 0.6908

ADHD:
Precision: 0.8856
Recall: 0.5711
F1-score: 0.6944

Schizophrenia:
Precision: 0.7540
Recall: 0.6153
F1-score: 0.6777

Asperger:
Precision: 0.6743
Recall: 0.6335
F1-score: 0.6533

PTSD:
Precision: 0.7724
Recall: 0.6235
F1-score: 0.6900
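
The per-class F1 scores follow from precision and recall as their harmonic mean, and the overall scores are averages over the nine classes. The sketch below recomputes them from the numbers in this README, using the validation class counts as support. The README does not say which averaging was used; in this recomputation the support-weighted F1 lands close to the reported 0.6355 and the unweighted mean precision close to the reported 0.7671.

```python
# (precision, recall, validation support) per class, copied from this README.
REPORT = {
    "Borderline": (0.7606, 0.4525, 1180),
    "Anxiety": (0.7063, 0.5459, 1185),
    "Depression": (0.7286, 0.4626, 1178),
    "Bipolar": (0.7997, 0.4487, 1219),
    "OCD": (0.8222, 0.5957, 1165),
    "ADHD": (0.8856, 0.5711, 1166),
    "Schizophrenia": (0.7540, 0.6153, 1131),
    "Asperger": (0.6743, 0.6335, 1108),
    "PTSD": (0.7724, 0.6235, 1089),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


per_class_f1 = {name: f1(p, r) for name, (p, r, _) in REPORT.items()}
total_support = sum(s for _, _, s in REPORT.values())

macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
weighted_f1 = sum(per_class_f1[n] * s for n, (_, _, s) in REPORT.items()) / total_support
macro_precision = sum(p for p, _, _ in REPORT.values()) / len(REPORT)

print(f"macro F1={macro_f1:.4f} weighted F1={weighted_f1:.4f} "
      f"macro precision={macro_precision:.4f}")
```

Recall is lower than precision for every class, which is why each F1 score sits well below the corresponding precision.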