COVID 19 Bio Annotations
The dataset was taken from https://github.com/davidcampos/covid19-corpus
Dataset
The dataset was then split into several datasets, each representing one entity type: Disorder, Species, Chemical or Drug, Gene and Protein, Enzyme, Anatomy, Biological Process, Molecular Function, Cellular Component, Pathway, and microRNA. An additional dataset combines all of the above entity types that do not overlap.
Other Dataset Formats
The datasets are available in two formats: IOB and spaCy's JSONL format.
IOB: https://github.com/tsantosh7/COVID-19-Named-Entity-Recognition/tree/master/Datasets/BIO
SpaCy JSONL: https://github.com/tsantosh7/COVID-19-Named-Entity-Recognition/tree/master/Datasets/SpaCy
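As a rough illustration of what the IOB format encodes, the sketch below collapses token-level IOB tags into entity spans. The sentence and its tags are invented for illustration and are not taken from the corpus. The helper name `iob_to_spans` is hypothetical, and the exact file layout in the repository above may differ.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse parallel token/IOB-tag lists into (entity_text, label) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    def close_open_entity() -> None:
        # Store the entity collected so far, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            close_open_entity()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open entity.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag that continues nothing) ends the open entity;
            # the token itself is not added to any span.
            close_open_entity()
            current_tokens = []
            current_label = None

    close_open_entity()
    return spans


# Invented example sentence; the labels come from this model's label scheme.
tokens = ["SARS-CoV-2", "infects", "lung", "epithelial", "cells", "."]
tags = ["B-SPEC", "O", "B-ANAT", "I-ANAT", "I-ANAT", "O"]
spans = iob_to_spans(tokens, tags)
print(spans)
```

For this made-up sentence the helper yields one `SPEC` span and one three-token `ANAT` span.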
Feature | Description |
---|---|
Name | en_covid19_ner |
Version | 0.0.0 |
spaCy | >=3.2.4,<3.3.0 |
Default Pipeline | transformer, ner |
Components | transformer, ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | n/a |
License | n/a |
Author | Santosh Tirunagai |
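A minimal usage sketch follows. It assumes the `en_covid19_ner` package is already installed in an environment with a compatible spaCy version (see the table above) and with `spacy-transformers` available for the `transformer` component. The helper names `load_pipeline` and `extract_entities` are hypothetical and not part of the model package.

```python
def load_pipeline(name: str = "en_covid19_ner"):
    """Load the installed pipeline by name (assumes the package is installed)."""
    import spacy  # imported lazily so the helper below works without spaCy

    return spacy.load(name)


def extract_entities(nlp, text: str):
    """Run a spaCy pipeline on text and return (entity_text, label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]
```

With the model installed, `extract_entities(load_pipeline(), "some sentence")` would return the recognized entities together with their labels.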
Label Scheme
Label scheme: 10 labels for 1 component.
Component | Labels |
---|---|
ner | ANAT, CHED, COMP, DISO, ENZY, FUNC, PATH, PRGE, PROC, SPEC |
Accuracy
Type | Score |
---|---|
ENTS_F | 92.50 |
ENTS_P | 91.40 |
ENTS_R | 93.62 |
TRANSFORMER_LOSS | 311768.03 |
NER_LOSS | 371171.50 |
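As a quick consistency check, the reported F-score can be recomputed from the reported precision and recall as their harmonic mean. The function name `f_score` is only illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ents_p = 91.40  # ENTS_P from the table above
ents_r = 93.62  # ENTS_R from the table above
ents_f = f_score(ents_p, ents_r)
print(f"{ents_f:.2f}")  # prints 92.50, matching ENTS_F
```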
Evaluation results
- NER Precision (self-reported): 0.914
- NER Recall (self-reported): 0.936
- NER F Score (self-reported): 0.925