---
tags:
- spacy
- token-classification
language:
- en
license: mit
model-index:
- name: en_biobert_ner_symptom
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9997017596
    - name: NER Recall
      type: recall
      value: 0.9994036971
    - name: NER F Score
      type: f_score
      value: 0.9995527061
widget:
- text: "Patient X reported coughing and sneezing."
  example_title: "Example 1"
- text: "There was a case of rash and inflammation."
  example_title: "Example 2"
- text: "He complained of dizziness during the trip."
  example_title: "Example 3"
- text: "I felt distressed, giddy and nauseous during my stay in Florida."
  example_title: "Example 4"
- text: "Mr. Y complained of breathlessness and chest pain when he was driving back to his house."
  example_title: "Example 5"
---
BioBERT-based NER model for medical symptoms.
| Feature | Description |
| --- | --- |
| **Name** | `en_biobert_ner_symptom` |
| **Version** | `1.0.0` |
| **spaCy** | `>=3.5.1,<3.6.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | `MIT` |
| **Author** | Sena Chae, Pratik Maitra, Padmini Srinivasan |
<b> <u> Model Description </u> </b>
The model is based on BioBERT and was trained on a combination of the MACCROBAT and i2c2 datasets. If you use this model, please cite the paper below:
<b>
<i>
Developing a BioBERT-based Natural Language Processing Algorithm for Acute Myeloid Leukemia Symptoms Identification from Clinical Notes - Sena Chae, Pratik Maitra, Padmini Srinivasan
</i>
</b>
<b> <u> How to use the Model </u> </b>
<div class="wrapper">
<span class="inner">
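from transformers import pipeline, AutoTokenizer, AutoModelForTokenClassification
# Load the tokenizer and token-classification weights for the "d4data/biomedical-ner-all" checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained("d4data/biomedical-ner-all")
# aggregation_strategy="simple" merges word pieces into whole entity spans; pass device=0 to run on a GPU.
pipe = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
# The pipeline returns a list of dicts, one per detected entity.
entities = pipe("The patient reported no recurrence of palpitations at follow-up 6 months after the ablation.")
print(entities)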
</span>
</div>
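
With `aggregation_strategy="simple"`, the `transformers` NER pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start` and `end`. The sketch below shows one possible way to post-process such a list by keeping only confident entities. The helper name `keep_confident_entities`, the `0.5` threshold, the `SYMPTOM` label and the scores are all hypothetical placeholders chosen for illustration; they are not real predictions from this model.

```python
from typing import Dict, List


def keep_confident_entities(entities: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep aggregated entities whose score is at least min_score (hypothetical helper)."""
    return [
        {"label": e["entity_group"], "text": e["word"], "start": e["start"], "end": e["end"]}
        for e in entities
        if e["score"] >= min_score
    ]


text = "The patient reported no recurrence of palpitations at follow-up 6 months after the ablation."


def fake_entity(word: str, score: float) -> Dict:
    """Build a placeholder record in the shape of the pipeline output (NOT a real prediction)."""
    start = text.index(word)
    return {"entity_group": "SYMPTOM", "score": score, "word": word,
            "start": start, "end": start + len(word)}


# Hand-written sample standing in for the pipeline result; labels and scores are made up.
sample = [fake_entity("palpitations", 0.98), fake_entity("ablation", 0.21)]

confident = keep_confident_entities(sample, min_score=0.5)
for ent in confident:
    print(ent["label"], repr(ent["text"]), ent["start"], ent["end"])
```

In practice, `sample` would be replaced by the list returned from `pipe(...)`, and the threshold would be tuned on held-out data rather than fixed at an arbitrary value.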
### Accuracy
| Type | Score |
| --- | --- |
| `ENTS_F` | 99.96 |
| `ENTS_P` | 99.97 |
| `ENTS_R` | 99.94 |
| `TRANSFORMER_LOSS` | 20456.83 |
| `NER_LOSS` | 38920.06 |
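
The `ENTS_P`, `ENTS_R` and `ENTS_F` rows are the precision, recall and F-score from the metadata, expressed as percentages. As a quick consistency check, and assuming the reported F-score is the standard F1 (the harmonic mean of precision and recall), the value can be recomputed from the other two. The function name `f1_score` is chosen here for illustration only.

```python
# Values copied from the model-index metadata of this card.
precision = 0.9997017596  # NER Precision
recall = 0.9994036971     # NER Recall


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F-score)."""
    return 2 * p * r / (p + r)


f1 = f1_score(precision, recall)

# Print the three scores as percentages with two decimals, as in the Accuracy table.
print(f"ENTS_P={precision * 100:.2f} ENTS_R={recall * 100:.2f} ENTS_F={f1 * 100:.2f}")
```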