ClinicalNER

Model Description

This is a multilingual clinical NER model that extracts DRUG, STRENGTH, FREQUENCY, DURATION, DOSAGE and FORM entities from medical text.

It consists of XLM-R Base fine-tuned on n2c2 (English). It is the model that obtains the best results on MedNERF, our French evaluation test set, in a zero-shot cross-lingual transfer setting.

Evaluation Metrics on MedNERF dataset

  • Loss: 0.692
  • Accuracy: 0.859
  • Precision: 0.817
  • Recall: 0.791
  • micro-F1: 0.804
  • macro-F1: 0.819
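
As a quick consistency check, and assuming the reported Precision and Recall are micro-averaged (the card does not state this explicitly), the micro-F1 above can be recovered as their harmonic mean:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.817 \times 0.791}{0.817 + 0.791}
    = \frac{1.2925}{1.608}
    \approx 0.804
```

The macro-F1 averages per-entity-type F1 scores instead, so it cannot be derived from the aggregate Precision and Recall alone.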

Usage

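from transformers import AutoModelForTokenClassification, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("Posos/ClinicalNER")
tokenizer = AutoTokenizer.from_pretrained("Posos/ClinicalNER")

# Tokenize one sentence and run a forward pass
inputs = tokenizer("Take 2 pills every morning", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits holds per-token scores for each label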
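
The snippet above stops at the raw model outputs. Below is a minimal sketch of how per-token labels might be merged into entity spans. It assumes the checkpoint exposes BIO-style labels (for example `B-DRUG`, `I-DRUG`, `O`) through `model.config.id2label`, which this card does not confirm. The helper `group_entities` is hypothetical and not part of `transformers`, and the labels in the example are hand-written for illustration, not actual model predictions.

```python
def group_entities(tokens, labels):
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, label in zip(tokens, labels):
        # "B-DRUG" -> ("B", "-", "DRUG"); "O" -> ("O", "", "")
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts here: close any open span first
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I":
            # Continuation of the currently open span
            current_tokens.append(token)
        else:
            # Outside any entity: close the open span, if any
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# With the real model, tokens and labels could be obtained like this
# (assumption: id2label holds BIO-style label strings):
#   predicted_ids = outputs.logits.argmax(dim=-1)[0]
#   labels = [model.config.id2label[int(i)] for i in predicted_ids]
#   tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

# Hand-written illustration (not model output)
example_tokens = ["Take", "2", "pills", "every", "morning"]
example_labels = ["O", "B-DOSAGE", "B-FORM", "B-FREQUENCY", "I-FREQUENCY"]
print(group_entities(example_tokens, example_labels))
# [('DOSAGE', '2'), ('FORM', 'pills'), ('FREQUENCY', 'every morning')]
```

Note that XLM-R uses a SentencePiece vocabulary, so real tokens are subword pieces; joining them with spaces as done here is a simplification, and mapping spans back to character offsets would give cleaner text.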

Citation information

@inproceedings{mednerf,
    title = "Multilingual Clinical NER: Translation or Cross-lingual Transfer?",
    author = "Gaschi, Félix and Fontaine, Xavier and Rastin, Parisa and Toussaint, Yannick",
    booktitle = "Proceedings of the 5th Clinical Natural Language Processing Workshop",
    publisher = "Association for Computational Linguistics",
    year = "2023"
}

Model size: 277M params (Safetensors; tensor types: I64, F32)