chrisvoncsefalvay committed
Commit c96adc1 · verified · 1 Parent(s): 62559ef

Model save

Files changed (2):
  1. README.md +15 -190
  2. model.safetensors +1 -1
README.md CHANGED
@@ -1,207 +1,32 @@
  ---
- language:
- - en
- license: apache-2.0
- library_name: transformers
  tags:
- - medical
- - pharmacovigilance
- - vaccines
- datasets:
- - chrisvoncsefalvay/vaers-outcomes
- metrics:
- - accuracy
- - f1
- - precision
- - recall
- dataset: chrisvoncsefalvay/vaers-outcomes
- pipeline_tag: text-classification
- widget:
- - text: Patient is a 90 y.o. male with a PMH of IPF, HFpEF, AFib (Eliquis), Metastatic
- Prostate Cancer who presented to Hospital 10/28/2023 following an unwitnessed
- fall at his assisted living. He was found to have an AKI, pericardial effusion,
- hypoxia, AMS, and COVID-19. His hospital course was complicated by delirium and
- aspiration, leading to acute hypoxic respiratory failure requiring BiPAP and transfer
- to the ICU. Palliative Care had been following, and after goals of care conversations
- on 11/10/2023 the patient was transitioned to DNR-CC. Patient expired at 0107
- 11/12/23.
- example_title: VAERS 2727645 (hospitalisation, death)
- - text: 'hospitalized for paralytic ileus a week after the vaccination; This serious
- case was reported by a physician via call center representative and described
- the occurrence of ileus paralytic in a patient who received Rota (Rotarix liquid
- formulation) for prophylaxis. On an unknown date, the patient received the 1st
- dose of Rotarix liquid formulation. On an unknown date, less than 2 weeks after
- receiving Rotarix liquid formulation, the patient experienced ileus paralytic
- (Verbatim: hospitalized for paralytic ileus a week after the vaccination) (serious
- criteria hospitalization and GSK medically significant). The outcome of the ileus
- paralytic was not reported. It was unknown if the reporter considered the ileus
- paralytic to be related to Rotarix liquid formulation. It was unknown if the company
- considered the ileus paralytic to be related to Rotarix liquid formulation. Additional
- Information: GSK Receipt Date: 27-DEC-2023 Age at vaccination and lot number were
- not reported. The patient of unknown age and gender was hospitalized for paralytic
- ileus a week after the vaccination. The reporting physician was in charge of the
- patient.'
- example_title: VAERS 2728408 (hospitalisation)
- - text: Patient received Pfizer vaccine 7 days beyond BUD. According to Pfizer manufacturer
- research data, vaccine is stable and effective up to 2 days after BUD. Waiting
- for more stability data from PFIZER to determine if revaccination is necessary.
- example_title: VAERS 2728394 (no event)
- - text: Fever of 106F rectally beginning 1 hr after immunizations and lasting <24
- hrs. Seen at ER treated w/tylenol & cool baths.
- example_title: VAERS 25042 (ER attendance)
- - text: I had the MMR shot last week, and I felt a little dizzy afterwards, but it
- passed after a few minutes and I'm doing fine now.
- example_title: 'Non-sample example: simulated informal patient narrative (no event)'
- - text: My niece had the COVID vaccine. A few weeks later, she was T-boned by a drunk
- driver. She called me from the ER. She's fully recovered now, though.
- example_title: 'Non-sample example: simulated informal patient narrative (ER attendance,
- albeit unconnected)'
  model-index:
  - name: daedra
- results:
- - task:
- type: text-classification
- dataset:
- name: vaers-outcomes
- type: vaers-outcomes
- metrics:
- - type: accuracy_microaverage
- value: 0.885
- name: Accuracy, microaveraged
- verified: false
- - type: f1_microaverage
- value: 0.885
- name: F1 score, microaveraged
- verified: false
- - type: precision_macroaverage
- value: 0.769
- name: Precision, macroaveraged
- verified: false
- - type: recall_macroaverage
- value: 0.688
- name: Recall, macroaveraged
- verified: false
  ---

- # DAEDRA: Determining Adverse Event Disposition for Regulatory Affairs

- This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) trained on the [VAERS adversome outcomes data set](https://huggingface.com/datasets/chrisvoncsefalvay/vaers-outcomes).

- # Table of Contents

- - [Model Details](#model-details)
- - [Uses](#uses)
- - [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- - [Training Details](#training-details)
- - [Evaluation](#evaluation)
- - [Environmental Impact](#environmental-impact)
- - [Technical Specifications](#technical-specifications-optional)
- - [Citation](#citation)


- # Model Details

- ## Model Description

- <!-- Provide a longer summary of what this model is/does. -->

- DAEDRA is a model for the identification of adverse event dispositions (outcomes) from passive pharmacovigilance data.
- The model is trained on a real-world adversomics data set spanning over three decades (1990-2023) and comprising over 1.8m records for a total corpus of 173,093,850 words constructed from a subset of reports submitted to VAERS.
- It is intended to identify, based on the narrative, whether any, or any combination, of three serious outcomes -- death, hospitalisation and ER attendance -- have occurred.

-
- - **Developed by:** Chris von Csefalvay
- - **Model type:** Language model
- - **Language(s) (NLP):** en
- - **License:** apache-2.0
- - **Parent Model:** [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2)
- - **Resources for more information:**
- - [GitHub Repo](https://github.com/chrisvoncsefalvay/daedra)
-
-
- # Uses
-
- This model was designed to facilitate the coding of passive adverse event reports into severity outcome categories.
-
- ## Direct Use
-
- Load the model via the `transformers` library:
-
- ```
- from transformers import AutoTokenizer, AutoModel
-
- tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/daedra")
- model = AutoModel.from_pretrained("chrisvoncsefalvay/daedra")
- ```
-
- ## Out-of-Scope Use
-
- This model is not intended for the diagnosis or treatment of any disease.
-
-
- # Bias, Risks, and Limitations
-
- Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
-
-
-
- # Training Details
-
- ## Training Data
-
- The model was trained on the [VAERS adversome outcomes data set](https://huggingface.com/datasets/chrisvoncsefalvay/vaers-outcomes), which comprises 1,814,920 reports from the FDA's Vaccine Adverse Events Reporting System (VAERS). Reports were split into a 70% training set and a 15% test set and 15% validation set after age and gender matching.
-
- ## Training Procedure
-
- Training was conducted on an Azure `Standard_NC24s_v3` instance in `us-east`, with 4x Tesla V100-PCIE-16GB GPUs and 24x Intel Xeon E5-2690 v4 CPUs at 2.60GHz.
-
- ### Speeds, Sizes, Times
-
- Training took 15 hours and 10 minutes.
-
-
- ## Testing Data, Factors & Metrics
-
- ### Testing Data
-
- The model was tested on the `test` partition of the [VAERS adversome outcomes data set](https://huggingface.com/datasets/chrisvoncsefalvay/vaers-outcomes).
-
- ## Results
-
- On the test set, the model achieved the following results:
-
- * `f1`: 0.885
- * `precision` and `recall`, microaveraged: 0.885
- * `precision`, macroaveraged: 0.769
- * `recall`, macroaveraged: 0.688
-
-
- # Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** 4 x Tesla V100-PCIE-16GB
- - **Hours used:** 15.166
- - **Cloud Provider:** Azure
- - **Compute Region:** us-east
- - **Carbon Emitted:** 6.72 kg CO2eq (offset by provider)
-
-
- # Citation
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- Forthcoming -- watch this space.
-
- # Model Card Authors
-
- <!-- This section provides another layer of transparency and accountability. Whose views is this model card representing? How many voices were included in its construction? Etc. -->
-
- Chris von Csefalvay

  ### Training hyperparameters

@@ -212,7 +37,7 @@ The following hyperparameters were used during training:
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 3

  ### Framework versions

 
  ---
+ base_model: dmis-lab/biobert-base-cased-v1.2
  tags:
+ - generated_from_trainer
  model-index:
  - name: daedra
+ results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # daedra

+ This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) on an unknown dataset.

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

  ### Training hyperparameters

  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - num_epochs: 5

  ### Framework versions

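The Direct Use snippet removed above loads the checkpoint with plain `AutoModel`, which yields hidden states rather than outcome scores; for the text-classification task the card describes, `AutoModelForSequenceClassification` followed by a per-label decision would be the usual route. Below is a minimal, dependency-free sketch of the decoding step only, assuming a multi-label reading of "any, or any combination" of the three outcomes and hypothetical label names (neither is confirmed by this card):

```python
import math

# Hypothetical label order -- the real id2label mapping lives in the
# checkpoint's config and should be read from there, not hard-coded.
LABELS = ["death", "hospitalisation", "er_attendance"]

def decode_outcomes(logits, threshold=0.5):
    """Sigmoid each logit independently (multi-label), since a report
    may carry any combination of the three serious outcomes."""
    probs = {lab: 1.0 / (1.0 + math.exp(-z)) for lab, z in zip(LABELS, logits)}
    flagged = [lab for lab, p in probs.items() if p >= threshold]
    return probs, flagged

# Made-up logits for illustration:
probs, flagged = decode_outcomes([-2.0, 1.5, 0.3])
# flagged -> ["hospitalisation", "er_attendance"]
```

With a loaded checkpoint, the `logits` list would come from something like `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()`, and the actual label order from `model.config.id2label`.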
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ae113c2100befad66edd984ce17c6688e70d5a916191fc809fb29eec9ccb87b0
+ oid sha256:65b6416503a81b0a04b01eb2faaee4b0753d3b8ca73c7e5b97709113b300bb2f
  size 503957528
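Both versions of the `model.safetensors` entry are git-lfs pointer files: the `oid sha256:…` line pins the blob's content hash, and `size` its byte length (unchanged here at 503,957,528 bytes, as expected when only the weight values changed). A sketch of checking a downloaded blob against such a pointer; `matches_pointer` is a hypothetical helper, not part of git or git-lfs tooling:

```python
import hashlib

def matches_pointer(blob: bytes, oid_hex: str, size: int) -> bool:
    """Verify a blob against a git-lfs pointer's sha256 oid and size fields."""
    return len(blob) == size and hashlib.sha256(blob).hexdigest() == oid_hex

# Toy example with a made-up blob (the real file is ~504 MB):
blob = b"example weights"
oid = hashlib.sha256(blob).hexdigest()
ok = matches_pointer(blob, oid, len(blob))
```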