chrisvoncsefalvay committed
Model save

Files changed: README.md (+15 -190), model.safetensors (+1 -1)

README.md CHANGED
@@ -1,207 +1,32 @@
---
-language:
-- en
-license: apache-2.0
-library_name: transformers
tags:
-- pharmacovigilance
-- vaccines
-datasets:
-- chrisvoncsefalvay/vaers-outcomes
-metrics:
-- accuracy
-- f1
-- precision
-- recall
-dataset: chrisvoncsefalvay/vaers-outcomes
-pipeline_tag: text-classification
-widget:
-- text: Patient is a 90 y.o. male with a PMH of IPF, HFpEF, AFib (Eliquis), Metastatic
-    Prostate Cancer who presented to Hospital 10/28/2023 following an unwitnessed
-    fall at his assisted living. He was found to have an AKI, pericardial effusion,
-    hypoxia, AMS, and COVID-19. His hospital course was complicated by delirium and
-    aspiration, leading to acute hypoxic respiratory failure requiring BiPAP and transfer
-    to the ICU. Palliative Care had been following, and after goals of care conversations
-    on 11/10/2023 the patient was transitioned to DNR-CC. Patient expired at 0107
-    11/12/23.
-  example_title: VAERS 2727645 (hospitalisation, death)
-- text: 'hospitalized for paralytic ileus a week after the vaccination; This serious
-    case was reported by a physician via call center representative and described
-    the occurrence of ileus paralytic in a patient who received Rota (Rotarix liquid
-    formulation) for prophylaxis. On an unknown date, the patient received the 1st
-    dose of Rotarix liquid formulation. On an unknown date, less than 2 weeks after
-    receiving Rotarix liquid formulation, the patient experienced ileus paralytic
-    (Verbatim: hospitalized for paralytic ileus a week after the vaccination) (serious
-    criteria hospitalization and GSK medically significant). The outcome of the ileus
-    paralytic was not reported. It was unknown if the reporter considered the ileus
-    paralytic to be related to Rotarix liquid formulation. It was unknown if the company
-    considered the ileus paralytic to be related to Rotarix liquid formulation. Additional
-    Information: GSK Receipt Date: 27-DEC-2023 Age at vaccination and lot number were
-    not reported. The patient of unknown age and gender was hospitalized for paralytic
-    ileus a week after the vaccination. The reporting physician was in charge of the
-    patient.'
-  example_title: VAERS 2728408 (hospitalisation)
-- text: Patient received Pfizer vaccine 7 days beyond BUD. According to Pfizer manufacturer
-    research data, vaccine is stable and effective up to 2 days after BUD. Waiting
-    for more stability data from PFIZER to determine if revaccination is necessary.
-  example_title: VAERS 2728394 (no event)
-- text: Fever of 106F rectally beginning 1 hr after immunizations and lasting <24
-    hrs. Seen at ER treated w/tylenol & cool baths.
-  example_title: VAERS 25042 (ER attendance)
-- text: I had the MMR shot last week, and I felt a little dizzy afterwards, but it
-    passed after a few minutes and I'm doing fine now.
-  example_title: 'Non-sample example: simulated informal patient narrative (no event)'
-- text: My niece had the COVID vaccine. A few weeks later, she was T-boned by a drunk
-    driver. She called me from the ER. She's fully recovered now, though.
-  example_title: 'Non-sample example: simulated informal patient narrative (ER attendance,
-    albeit unconnected)'
model-index:
- name: daedra
-  results:
-  - task:
-      type: text-classification
-    dataset:
-      name: vaers-outcomes
-      type: vaers-outcomes
-    metrics:
-    - type: accuracy_microaverage
-      value: 0.885
-      name: Accuracy, microaveraged
-      verified: false
-    - type: f1_microaverage
-      value: 0.885
-      name: F1 score, microaveraged
-      verified: false
-    - type: precision_macroaverage
-      value: 0.769
-      name: Precision, macroaveraged
-      verified: false
-    - type: recall_macroaverage
-      value: 0.688
-      name: Recall, macroaveraged
-      verified: false
---

-- [Uses](#uses)
-- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
-- [Training Details](#training-details)
-- [Evaluation](#evaluation)
-- [Environmental Impact](#environmental-impact)
-- [Technical Specifications](#technical-specifications-optional)
-- [Citation](#citation)

-The model is trained on a real-world adversomics data set spanning more than three decades (1990-2023), comprising over 1.8 million records and a total corpus of 173,093,850 words drawn from a subset of reports submitted to VAERS.
-It is intended to identify, based on the narrative, whether any one, or any combination, of three serious outcomes -- death, hospitalisation and ER attendance -- has occurred.

-- **Developed by:** Chris von Csefalvay
-- **Model type:** Language model
-- **Language(s) (NLP):** en
-- **License:** apache-2.0
-- **Parent Model:** [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2)
-- **Resources for more information:**
-  - [GitHub Repo](https://github.com/chrisvoncsefalvay/daedra)
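
The card frames this as a multi-label problem: each narrative maps to a multi-hot vector over the three outcomes, not to a single class. A minimal sketch of that encoding (the label names are illustrative assumptions, not the model's actual `id2label` configuration):

```python
# The three serious outcomes; any subset may apply to a single report.
# Label names are illustrative assumptions, not the model's id2label.
OUTCOMES = ["death", "hospitalisation", "er_attendance"]

def encode_outcomes(present: set) -> list:
    """Encode the outcomes present in a report as a multi-hot vector."""
    return [int(label in present) for label in OUTCOMES]

# VAERS 2727645 in the widget examples is tagged "hospitalisation, death":
print(encode_outcomes({"death", "hospitalisation"}))  # -> [1, 1, 0]
```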

-# Uses

-This model was designed to facilitate the coding of passive adverse event reports into severity outcome categories.

-## Direct Use

-Load the model via the `transformers` library:

-```python
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
-tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/daedra")
-model = AutoModelForSequenceClassification.from_pretrained("chrisvoncsefalvay/daedra")
-```
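
For an end-to-end check, the loading snippet above can also be wrapped in the `text-classification` pipeline. A minimal sketch, assuming a recent `transformers` release (for `top_k=None`) and that the hosted weights ship a classification head with an `id2label` mapping:

```python
from transformers import pipeline

# top_k=None returns a score for every label rather than only the best
# one, which is the useful behaviour for a multi-label outcome model.
clf = pipeline(
    "text-classification",
    model="chrisvoncsefalvay/daedra",
    top_k=None,
)

# One of the card's own widget examples (VAERS 25042):
report = (
    "Fever of 106F rectally beginning 1 hr after immunizations and "
    "lasting <24 hrs. Seen at ER treated w/tylenol & cool baths."
)

for item in clf([report])[0]:  # one list of {label, score} dicts per input
    print(f"{item['label']}: {item['score']:.3f}")
```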

-## Out-of-Scope Use

-This model is not intended for the diagnosis or treatment of any disease.

-# Bias, Risks, and Limitations

-Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

-# Training Details

-## Training Data

-The model was trained on the [VAERS adversome outcomes data set](https://huggingface.co/datasets/chrisvoncsefalvay/vaers-outcomes), which comprises 1,814,920 reports from the FDA's Vaccine Adverse Event Reporting System (VAERS). After age and gender matching, reports were split into a 70% training set, a 15% test set and a 15% validation set.
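
The partitions can be inspected directly from the Hub with the `datasets` library; printing the `DatasetDict` is the safest first step, since the split names and columns are whatever the dataset repository defines:

```python
from datasets import load_dataset

# Load the dataset referenced by the card and list its splits and columns.
ds = load_dataset("chrisvoncsefalvay/vaers-outcomes")
print(ds)
```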

-## Training Procedure

-Training was conducted on an Azure `Standard_NC24s_v3` instance in `us-east`, with 4x Tesla V100-PCIE-16GB GPUs and 24 Intel Xeon E5-2690 v4 vCPUs at 2.60 GHz.

-### Speeds, Sizes, Times

-Training took 15 hours and 10 minutes.

-## Testing Data, Factors & Metrics

-### Testing Data

-The model was tested on the `test` partition of the [VAERS adversome outcomes data set](https://huggingface.co/datasets/chrisvoncsefalvay/vaers-outcomes).

-## Results

-On the test set, the model achieved the following results:

-* `f1`, microaveraged: 0.885
-* `precision` and `recall`, microaveraged: 0.885
-* `precision`, macroaveraged: 0.769
-* `recall`, macroaveraged: 0.688
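
To make the averaging explicit: micro-averaging pools every label decision across all reports before computing the metric, while macro-averaging computes the metric per label and then takes the unweighted mean, so rare labels count as much as common ones. A sketch with `scikit-learn` on toy data (the arrays are illustrative, not the model's predictions):

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Toy multi-hot ground truth and predictions over the three outcome
# labels (death, hospitalisation, ER attendance); illustrative only.
y_true = np.array([[0, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])
y_pred = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])

print(f1_score(y_true, y_pred, average="micro"))         # pooled decisions
print(precision_score(y_true, y_pred, average="macro"))  # per-label mean
print(recall_score(y_true, y_pred, average="macro"))     # per-label mean
```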

-# Environmental Impact

-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

-- **Hardware Type:** 4 x Tesla V100-PCIE-16GB
-- **Hours used:** 15.166
-- **Cloud Provider:** Azure
-- **Compute Region:** us-east
-- **Carbon Emitted:** 6.72 kg CO2eq (offset by provider)
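
The emitted figure is consistent with a back-of-envelope check, assuming the four V100s draw close to their 250 W TDP for the full run and a grid intensity of roughly 0.44 kg CO2eq/kWh (both assumptions; neither is stated in the card):

```latex
E \approx 4 \times 0.25\,\mathrm{kW} \times 15.166\,\mathrm{h} \approx 15.2\,\mathrm{kWh},
\qquad
15.2\,\mathrm{kWh} \times 0.44\,\mathrm{kg\,CO_2eq/kWh} \approx 6.7\,\mathrm{kg\,CO_2eq}
```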

-# Citation

-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

-**BibTeX:**

-Forthcoming -- watch this space.

-# Model Card Authors

-<!-- This section provides another layer of transparency and accountability. Whose views is this model card representing? How many voices were included in its construction? Etc. -->

-Chris von Csefalvay

### Training hyperparameters

@@ -212,7 +37,7 @@ The following hyperparameters were used during training:
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
-- num_epochs:

### Framework versions

README.md (after change):

---
+base_model: dmis-lab/biobert-base-cased-v1.2
tags:
+- generated_from_trainer
model-index:
- name: daedra
+  results: []
---

+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->

+# daedra

+This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) on an unknown dataset.

+## Model description

+More information needed

+## Intended uses & limitations

+More information needed

+## Training and evaluation data

+More information needed

+## Training procedure

### Training hyperparameters

- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
+- num_epochs: 5
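
A minimal sketch of how the stated values map onto `transformers.TrainingArguments`; `output_dir` is a placeholder, and the hyperparameters in the unchanged context this diff does not show (learning rate, batch sizes) are omitted rather than guessed:

```python
from transformers import TrainingArguments

# Only values stated in the card are set explicitly; everything else is
# left at library defaults. output_dir is a placeholder.
args = TrainingArguments(
    output_dir="daedra",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```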

### Framework versions

model.safetensors CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:65b6416503a81b0a04b01eb2faaee4b0753d3b8ca73c7e5b97709113b300bb2f
size 503957528