lrei committed on
Commit 6b28ebf · verified · 1 Parent(s): 03e5582

Update README.md

Files changed (1):
  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: mit
- ---
+ ---
+ license: cc0-1.0
+ ---
+
+ This is a [distilroberta-base](distilbert/distilroberta-base) model fine-tuned to classify text into 3 categories (see the usage example after the list):
+
+ - Rare Diseases
+ - Non-Rare Diseases
+ - Other
+
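A minimal usage sketch (an illustration, not part of the original card): it loads the classifier through the `transformers` text-classification pipeline. The model id below is a placeholder; substitute this repository's Hub id.

```python
# Minimal usage sketch: run the fine-tuned classifier through the
# transformers text-classification pipeline.
# "<this-model-id>" is a placeholder for this repository's Hub id.
from transformers import pipeline

classifier = pipeline("text-classification", model="<this-model-id>")

abstract = (
    "We report a case of Gaucher disease managed with "
    "enzyme replacement therapy."
)
print(classifier(abstract))
# -> [{'label': ..., 'score': ...}]; the label corresponds to one of
#    Rare Diseases, Non-Rare Diseases, or Other (the exact label
#    strings depend on the model's config).
```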
+
+ The details of how this model was built and evaluated are provided in the article:
+
+ Rei L, Pita Costa J, Zdolšek Draksler T. Automatic Classification and Visualization of Text Data on Rare Diseases. _Journal of Personalized Medicine_. 2024; 14(5):545. https://doi.org/10.3390/jpm14050545
+
+ ```
+ @Article{jpm14050545,
+ AUTHOR = {Rei, Luis and Pita Costa, Joao and Zdolšek Draksler, Tanja},
+ TITLE = {Automatic Classification and Visualization of Text Data on Rare Diseases},
+ JOURNAL = {Journal of Personalized Medicine},
+ VOLUME = {14},
+ YEAR = {2024},
+ NUMBER = {5},
+ ARTICLE-NUMBER = {545},
+ URL = {https://www.mdpi.com/2075-4426/14/5/545},
+ PubMedID = {38793127},
+ ISSN = {2075-4426},
+ DOI = {10.3390/jpm14050545}
+ }
+ ```
+ Note that in the article the larger roberta-base model is fine-tuned instead; this smaller model is shared for demonstration and validation purposes. Hyper-parameters were not tuned.
+
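For illustration only, here is a generic sketch of this kind of fine-tuning: distilroberta-base trained for 3-class sequence classification with near-default hyper-parameters. It uses a toy in-memory dataset and is not the repository's actual training code (see the Code section below for that).

```python
# Generic sketch: fine-tune distilroberta-base for 3-class sequence
# classification with near-default hyper-parameters. The toy in-memory
# data below stands in for the real PubMed dataset; this is NOT the
# repository's actual training script.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

labels = ["Rare Diseases", "Non-Rare Diseases", "Other"]
train_ds = Dataset.from_dict({
    "text": [
        "A case of Gaucher disease treated with enzyme replacement therapy.",
        "Hypertension management in primary care.",
        "A survey of hospital information systems.",
    ],
    "label": [0, 1, 2],  # one toy example per class
})

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilroberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilroberta-base", num_labels=len(labels)
)

# Tokenize the text column; padding is handled per-batch by the collator.
train_ds = train_ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True), batched=True
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train_ds,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```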
+ ## Dataset
+
+ The dataset used to train this model is available on [Zenodo](https://zenodo.org/records/13882003).
+ It is a subset of abstracts obtained from PubMed and sorted into the 3 classes on the basis of their MeSH terms.
+
+ Like the model, the dataset is provided for demonstration and methodology validation purposes. The original PubMed data was randomly under-sampled.
+
+ ## Code
+
+ The code used to create this model is available on [GitHub](https://github.com/lrei/rad).
+
+ ## Test Results
+
+ Averaged over all 3 classes:
+
+ | average | precision | recall | F1   |
+ | ------- | --------- | ------ | ---- |
+ | micro   | 0.84      | 0.84   | 0.84 |
+ | macro   | 0.84      | 0.84   | 0.84 |
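For reference, micro- and macro-averaged precision, recall, and F1 of this kind can be computed with scikit-learn. The sketch below uses made-up predictions, not the model's actual test-set outputs:

```python
# Sketch of how micro/macro precision, recall, and F1 are computed with
# scikit-learn; y_true and y_pred are made-up stand-ins, not the real
# test-set predictions behind the table above.
from sklearn.metrics import precision_recall_fscore_support

labels = ["Rare Diseases", "Non-Rare Diseases", "Other"]
y_true = ["Rare Diseases", "Other", "Non-Rare Diseases", "Rare Diseases"]
y_pred = ["Rare Diseases", "Other", "Rare Diseases", "Rare Diseases"]

for average in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=labels, average=average, zero_division=0
    )
    print(f"{average}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```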