metadata
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - medical
  - pharmacovigilance
  - vaccines
datasets:
  - chrisvoncsefalvay/vaers-outcomes
metrics:
  - accuracy
  - f1
  - precision
  - recall
dataset: chrisvoncsefalvay/vaers-outcomes
pipeline_tag: text-classification
widget:
  - text: >-
      Patient is a 90 y.o. male with a PMH of IPF, HFpEF, AFib (Eliquis),
      Metastatic Prostate Cancer who presented to Hospital 10/28/2023 following
      an unwitnessed fall at his assisted living. He was found to have an AKI,
      pericardial effusion, hypoxia, AMS, and COVID-19. His hospital course was
      complicated by delirium and aspiration, leading to acute hypoxic
      respiratory failure requiring BiPAP and transfer to the ICU. Palliative
      Care had been following, and after goals of care conversations on
      11/10/2023 the patient was transitioned to DNR-CC. Patient expired at 0107
      11/12/23.
    example_title: VAERS 2727645 (hospitalisation, death)
  - text: >-
      hospitalized for paralytic ileus a week after the vaccination; This
      serious case was reported by a physician via call center representative
      and described the occurrence of ileus paralytic in a patient who received
      Rota (Rotarix liquid formulation) for prophylaxis. On an unknown date, the
      patient received the 1st dose of Rotarix liquid formulation. On an unknown
      date, less than 2 weeks after receiving Rotarix liquid formulation, the
      patient experienced ileus paralytic (Verbatim: hospitalized for paralytic
      ileus a week after the vaccination) (serious criteria hospitalization and
      GSK medically significant). The outcome of the ileus paralytic was not
      reported. It was unknown if the reporter considered the ileus paralytic to
      be related to Rotarix liquid formulation. It was unknown if the company
      considered the ileus paralytic to be related to Rotarix liquid
      formulation. Additional Information: GSK Receipt Date: 27-DEC-2023 Age at
      vaccination and lot number were not reported. The patient of unknown age
      and gender was hospitalized for paralytic ileus a week after the
      vaccination. The reporting physician was in charge of the patient.
    example_title: VAERS 2728408 (hospitalisation)
  - text: >-
      Patient received Pfizer vaccine 7 days beyond BUD. According to Pfizer
      manufacturer research data, vaccine is stable and effective up to 2 days
      after BUD. Waiting for more stability data from PFIZER to determine if
      revaccination is necessary.
    example_title: VAERS 2728394 (no event)
  - text: >-
      Fever of 106F rectally beginning 1 hr after immunizations and lasting <24
      hrs. Seen at ER treated w/tylenol & cool baths.
    example_title: VAERS 25042 (ER attendance)
  - text: >-
      I had the MMR shot last week, and I felt a little dizzy afterwards, but it
      passed after a few minutes and I'm doing fine now.
    example_title: 'Non-sample example: simulated informal patient narrative (no event)'
  - text: >-
      My niece had the COVID vaccine. A few weeks later, she was T-boned by a
      drunk driver. She called me from the ER. She's fully recovered now,
      though.
    example_title: >-
      Non-sample example: simulated informal patient narrative (ER attendance,
      albeit unconnected)
model-index:
  - name: daedra
    results:
      - task:
          type: text-classification
        dataset:
          name: vaers-outcomes
          type: vaers-outcomes
        metrics:
          - type: accuracy_microaverage
            value: 0.885
            name: Accuracy, microaveraged
            verified: false
          - type: f1_microaverage
            value: 0.885
            name: F1 score, microaveraged
            verified: false
          - type: precision_macroaverage
            value: 0.769
            name: Precision, macroaveraged
            verified: false
          - type: recall_macroaverage
            value: 0.688
            name: Recall, macroaveraged
            verified: false

DAEDRA: Determining Adverse Event Disposition for Regulatory Affairs

This model is a fine-tuned version of dmis-lab/biobert-base-cased-v1.2 trained on the VAERS adversome outcomes data set.

Model Details

Model Description

DAEDRA is a model for identifying adverse event dispositions (outcomes) from passive pharmacovigilance data. It was trained on a real-world adversomics data set spanning over three decades (1990-2023), comprising over 1.8 million records and a total corpus of 173,093,850 words, constructed from a subset of reports submitted to VAERS. Based on the report narrative, it is intended to identify whether any, or any combination, of three serious outcomes (death, hospitalisation and ER attendance) has occurred.

Uses

This model was designed to facilitate the coding of passive adverse event reports into severity outcome categories.

Direct Use

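Load the model and its tokenizer via the transformers library. AutoModelForSequenceClassification is used rather than AutoModel so that the fine-tuned classification head is loaded together with the encoder:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The tokenizer and the model weights come from the same repository.
tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/daedra")
model = AutoModelForSequenceClassification.from_pretrained("chrisvoncsefalvay/daedra")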
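
Once loaded, the model can score a report narrative. The following is a minimal sketch rather than a reference implementation: it assumes the checkpoint works with the transformers text-classification pipeline, and the helper names `scores_by_label` and `classify` are hypothetical. The label names come from the checkpoint's configuration and are not documented in this card, so inspect them before mapping them to outcomes.

```python
from typing import Dict, List


def scores_by_label(per_report: List[Dict[str, float]]) -> Dict[str, float]:
    """Turn one report's list of {label, score} entries into a label -> score map."""
    return {entry["label"]: entry["score"] for entry in per_report}


def classify(narratives: List[str],
             model_id: str = "chrisvoncsefalvay/daedra") -> List[Dict[str, float]]:
    """Score each narrative against every label the checkpoint defines."""
    # Imported lazily so the helper above stays usable without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # top_k=None requests the scores of all labels, not only the best one;
    # truncation=True clips narratives longer than the model's input limit.
    results = classifier(narratives, top_k=None, truncation=True)
    return [scores_by_label(per_report) for per_report in results]
```

Calling `classify(["Seen at ER treated w/tylenol & cool baths."])` downloads the checkpoint on first use and returns one label-to-score dictionary per narrative.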

Out-of-Scope Use

This model is not intended for the diagnosis or treatment of any disease.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

Training Details

Training Data

The model was trained on the VAERS adversome outcomes data set, which comprises 1,814,920 reports from the Vaccine Adverse Event Reporting System (VAERS), co-managed by the CDC and the FDA. After age and gender matching, reports were split into training (70%), test (15%) and validation (15%) sets.
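
As a rough illustration of the partition sizes, the sketch below shuffles report indices and cuts them 70/15/15. It is an assumption-laden stand-in: it does not reproduce the age and gender matching, the function names are hypothetical, and the shuffle seed of 42 is borrowed from the training seed only for illustration.

```python
import random
from typing import List, Tuple


def split_sizes(n_reports: int) -> Tuple[int, int, int]:
    """Return (train, test, validation) sizes for a 70/15/15 split."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    n_train = n_reports * 70 // 100
    n_test = n_reports * 15 // 100
    # Any rounding remainder goes to the validation partition.
    n_val = n_reports - n_train - n_test
    return n_train, n_test, n_val


def split_indices(n_reports: int, seed: int = 42) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle report indices and cut them into the three partitions."""
    indices = list(range(n_reports))
    random.Random(seed).shuffle(indices)  # assumed seed, not from the card
    n_train, n_test, _ = split_sizes(n_reports)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_test],
        indices[n_train + n_test:],
    )


print(split_sizes(1_814_920))  # (1270444, 272238, 272238)
```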

Training Procedure

Training was conducted on an Azure Standard_NC24s_v3 instance in us-east, with 4x Tesla V100-PCIE-16GB GPUs and 24 vCPUs (Intel Xeon E5-2690 v4 at 2.60GHz).

Speeds, Sizes, Times

Training took 15 hours and 10 minutes.

Testing Data, Factors & Metrics

Testing Data

The model was tested on the test partition of the VAERS adversome outcomes data set.

Results

On the test set, the model achieved the following results:

  • F1 score, microaveraged: 0.885
  • precision and recall, microaveraged: 0.885
  • precision, macroaveraged: 0.769
  • recall, macroaveraged: 0.688
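
The gap between the microaveraged and macroaveraged figures above is typical when classes are imbalanced. The toy sketch below shows how the two averages are computed; it assumes each report receives exactly one label (the card does not state whether the model's head is single-label), and the label names and data are invented for illustration only.

```python
from typing import Dict, List


def averaged_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute micro- and macro-averaged precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    # Per-class counts of true positives, false positives, false negatives.
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    # Micro: pool the counts over all classes, then divide once.
    micro_p = ratio(sum(tp.values()), sum(tp.values()) + sum(fp.values()))
    micro_r = ratio(sum(tp.values()), sum(tp.values()) + sum(fn.values()))
    # Macro: compute the metric per class, then take the unweighted mean.
    macro_p = sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / len(labels)
    macro_r = sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / len(labels)
    return {"micro_precision": micro_p, "micro_recall": micro_r,
            "macro_precision": macro_p, "macro_recall": macro_r}


# Hypothetical, imbalanced toy data: 8 "none" reports and 2 "death" reports.
truth = ["none"] * 8 + ["death"] * 2
preds = ["none"] * 7 + ["death"] + ["death", "none"]
print(averaged_metrics(truth, preds))
# micro precision and recall are 0.8; macro precision and recall are 0.6875
```

In the single-label case every misclassification is one false positive and one false negative, so microaveraged precision, recall, F1 and accuracy coincide, while the macro averages are pulled down by the rarer class.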

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: 4 x Tesla V100-PCIE-16GB
  • Hours used: 15.166
  • Cloud Provider: Azure
  • Compute Region: us-east
  • Carbon Emitted: 6.72 kg CO2eq (offset by provider)
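
The reported figures can be related with simple arithmetic. The sketch below is illustrative only: it derives GPU-hours and the implied emissions per GPU-hour from the numbers listed above, and includes a generic energy-times-intensity estimate whose power draw and grid carbon intensity are caller-supplied assumptions, not values taken from this card.

```python
def gpu_hours(n_gpus: int, wall_clock_hours: float) -> float:
    """Total accelerator time: number of GPUs times wall-clock hours."""
    return n_gpus * wall_clock_hours


def emissions_per_gpu_hour(total_kg: float, n_gpus: int, wall_clock_hours: float) -> float:
    """Implied kg CO2eq per GPU-hour, derived from reported totals."""
    return total_kg / gpu_hours(n_gpus, wall_clock_hours)


def estimate_emissions_kg(n_gpus: int, wall_clock_hours: float,
                          gpu_power_kw: float, intensity_kg_per_kwh: float) -> float:
    """Generic estimate: energy used (kWh) times grid carbon intensity.

    gpu_power_kw and intensity_kg_per_kwh are assumptions the caller must
    supply; neither is stated in the model card.
    """
    energy_kwh = gpu_hours(n_gpus, wall_clock_hours) * gpu_power_kw
    return energy_kwh * intensity_kg_per_kwh


# Figures reported above: 4 GPUs, 15.166 hours, 6.72 kg CO2eq.
print(round(gpu_hours(4, 15.166), 3))                      # 60.664
print(round(emissions_per_gpu_hour(6.72, 4, 15.166), 3))   # 0.111
```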

Citation

BibTeX:

Forthcoming -- watch this space.

Model Card Authors

Chris von Csefalvay

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
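
To make the linear schedule concrete, the sketch below computes the number of optimizer steps and the learning rate at a given step. Several things here are assumptions rather than facts from this card: that the training partition is 70% of the 1,814,920 reports, that 64 is the effective batch size across all GPUs, that there is no warmup or gradient accumulation, and that the rate decays linearly to zero. The function names are hypothetical.

```python
import math


def total_steps(n_train: int, batch_size: int, num_epochs: int) -> int:
    """Optimizer steps over the whole run, counting a final partial batch."""
    return math.ceil(n_train / batch_size) * num_epochs


def linear_lr(step: int, total: int, base_lr: float = 2e-5) -> float:
    """Learning rate after `step` updates under linear decay to zero."""
    remaining = max(0, total - step)
    return base_lr * remaining / total


steps = total_steps(n_train=1_270_444, batch_size=64, num_epochs=3)
print(steps)                        # 59553
print(linear_lr(0, steps))          # starts at the base rate of 2e-05
print(linear_lr(steps, steps))      # 0.0 at the end of training
```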

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.3.2
  • Tokenizers 0.15.1