---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- f1
- accuracy
model-index:
- name: pretrained_model
  results:
  - task:
      name: Text Classification
      type: text-classification
    metrics:
    - name: F1
      type: f1
      value: 0.6356
    - name: AUC
      type: auc
      value: 0.7643

---
This model is a fine-tuned version of distilbert-base-uncased, trained on a Reddit dataset of posts in which users describe their mental health. It predicts mental health disorders from textual content.

It achieves the following results on the validation set:

* Loss: 0.1873
* F1: 0.6356
* AUC: 0.7643
* Precision: 0.7671

# Description
This model is based on DistilBERT, a lighter variant of BERT, and predicts different mental disorders. It is trained on a custom dataset of Reddit posts about users' general experiences with mental health problems.
All direct mentions of the disorder names in the texts were removed.     
    
It includes the following classes:   

* Borderline
* Anxiety
* Depression
* Bipolar
* OCD
* ADHD
* Schizophrenia
* Asperger
* PTSD
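
The removal of disorder names mentioned in the description could be sketched roughly as follows. This is only an illustration: the term list, the helper name `remove_disorder_mentions`, and the case-insensitive whole-word matching rule are assumptions, since the card does not say how the removal was actually done.

```python
import re

# Hypothetical term list built from the class names above, plus one
# common abbreviation. The list actually used for the dataset is not given.
DISORDER_TERMS = [
    "borderline", "bpd", "anxiety", "depression", "bipolar",
    "ocd", "adhd", "schizophrenia", "asperger", "ptsd",
]

# Case-insensitive whole-word match on any of the terms.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in DISORDER_TERMS) + r")\b",
    re.IGNORECASE,
)


def remove_disorder_mentions(text: str) -> str:
    """Delete direct disorder mentions and collapse leftover whitespace."""
    stripped = PATTERN.sub("", text)
    return re.sub(r"\s+", " ", stripped).strip()


print(remove_disorder_mentions("I was diagnosed with ADHD last year"))
```

Removing the names keeps the classifier from simply spotting the label in the text, so it has to rely on how the experience is described.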

# Training
Train size: 90%   
Val size: 10%   
   
Class counts (text samples) after balancing:

| Class         | Training set | Validation set |
|---------------|--------------|----------------|
| Borderline    | 10398        | 1180           |
| Anxiety       | 10393        | 1185           |
| Depression    | 10400        | 1178           |
| Bipolar       | 10359        | 1219           |
| OCD           | 10413        | 1165           |
| ADHD          | 10412        | 1166           |
| Schizophrenia | 10447        | 1131           |
| Asperger      | 10470        | 1108           |
| PTSD          | 10489        | 1089           |
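
As a quick arithmetic check on the split, the counts above can be summed per class and overall. The numbers are copied from this card. The variable names are only illustrative.

```python
# Per-class sample counts copied from the card.
train_counts = {
    "Borderline": 10398, "Anxiety": 10393, "Depression": 10400,
    "Bipolar": 10359, "OCD": 10413, "ADHD": 10412,
    "Schizophrenia": 10447, "Asperger": 10470, "PTSD": 10489,
}
val_counts = {
    "Borderline": 1180, "Anxiety": 1185, "Depression": 1178,
    "Bipolar": 1219, "OCD": 1165, "ADHD": 1166,
    "Schizophrenia": 1131, "Asperger": 1108, "PTSD": 1089,
}

# Total samples per class across both splits.
per_class_total = {
    name: train_counts[name] + val_counts[name] for name in train_counts
}

train_total = sum(train_counts.values())
val_total = sum(val_counts.values())
train_fraction = train_total / (train_total + val_total)

print(per_class_total)
print(train_total, val_total, f"{train_fraction:.4f}")
```

Every class sums to 11,578 samples across the two splits, and the training share comes out at about 90%. This is consistent with a balanced dataset that was then split 90/10.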

The following hyperparameters were used during training:   
   
model-finetuning: distilbert/distilbert-base-uncased   
   
learning_rate: 1e-5   
train_batch_size: 64   
val_batch_size: 64   
weight_decay: 0.01   
optimizer: AdamW   
num_epochs: 2-3   
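
To give a sense of the training length these settings imply, the sketch below collects the hyperparameters in a plain dictionary and derives the number of optimizer steps. It assumes a single device, no gradient accumulation, and that the last partial batch is kept. None of these is stated in the card, and the helper `optimizer_steps` is hypothetical.

```python
import math

# Hyperparameters as listed in the card.
hyperparams = {
    "base_model": "distilbert/distilbert-base-uncased",
    "learning_rate": 1e-5,
    "train_batch_size": 64,
    "val_batch_size": 64,
    "weight_decay": 0.01,
    "optimizer": "AdamW",
}

# Sum of the training-set class counts listed in the card.
TRAIN_SAMPLES = 93781


def optimizer_steps(num_samples: int, batch_size: int, num_epochs: int) -> int:
    """Steps taken when each batch triggers exactly one optimizer update."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * num_epochs


# The card gives the epoch count as 2-3, so both cases are shown.
for epochs in (2, 3):
    steps = optimizer_steps(TRAIN_SAMPLES, hyperparams["train_batch_size"], epochs)
    print(epochs, steps)
```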

# Training results
| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1.0   | 0.2660        | 0.2031          |
| 2.0   | 0.1891        | 0.1872          |

F1 Score: 0.6355   
AUC Score: 0.7642   

## Classification Report
| Class         | Precision | Recall | F1-score |
|---------------|-----------|--------|----------|
| Borderline    | 0.7606    | 0.4525 | 0.5674   |
| Anxiety       | 0.7063    | 0.5459 | 0.6158   |
| Depression    | 0.7286    | 0.4626 | 0.5659   |
| Bipolar       | 0.7997    | 0.4487 | 0.5748   |
| OCD           | 0.8222    | 0.5957 | 0.6908   |
| ADHD          | 0.8856    | 0.5711 | 0.6944   |
| Schizophrenia | 0.7540    | 0.6153 | 0.6777   |
| Asperger      | 0.6743    | 0.6335 | 0.6533   |
| PTSD          | 0.7724    | 0.6235 | 0.6900   |
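
The per-class F1 scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below recomputes them from the rounded values in this report. The function and variable names are only illustrative.

```python
# (precision, recall, reported F1) per class, copied from the report.
report = {
    "Borderline": (0.7606, 0.4525, 0.5674),
    "Anxiety": (0.7063, 0.5459, 0.6158),
    "Depression": (0.7286, 0.4626, 0.5659),
    "Bipolar": (0.7997, 0.4487, 0.5748),
    "OCD": (0.8222, 0.5957, 0.6908),
    "ADHD": (0.8856, 0.5711, 0.6944),
    "Schizophrenia": (0.7540, 0.6153, 0.6777),
    "Asperger": (0.6743, 0.6335, 0.6533),
    "PTSD": (0.7724, 0.6235, 0.6900),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every class from its precision and recall.
recomputed_f1 = {
    name: f1_from(precision, recall)
    for name, (precision, recall, _) in report.items()
}

# Unweighted (macro) average of the per-class precisions.
macro_precision = sum(p for p, _, _ in report.values()) / len(report)

for name, (_, _, reported) in report.items():
    print(f"{name:<14} reported={reported:.4f} recomputed={recomputed_f1[name]:.4f}")
print(f"macro precision = {macro_precision:.4f}")
```

The recomputed values agree with the reported F1 scores to within rounding. The unweighted mean of the per-class precisions is about 0.7671, which matches the overall Precision listed at the top of the card.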