Update README.md
README.md
CHANGED
@@ -4,4 +4,91 @@ license: mit
inference:
  parameters:
    aggregation_strategy: "average"

language:
- pt
pipeline_tag: fill-mask
tags:
- medialbertina-ptpt
- deberta
- portuguese
- european portuguese
- medical
- clinical
- healthcare
- NER
- Named Entity Recognition
- IE
- Information Extraction
widget:
- text: Durante a cirurgia ortopédica para corrigir a fratura no tornozelo, os sinais vitais do utente, incluindo a pressão arterial, com leitura de 120/87 mmHg, a frequência cardíaca, de 80 batimentos por minuto, e SpO2 a 98%, foram monitorizados. Após a cirurgia o utente apresentava dor intensa no local e inchaço no tornozelo, mas os resultados dos exames de radiografia revelaram uma recuperação satisfatória.
  example_title: Example 1
- text: Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente.
  example_title: Example 2
- text: Foi recomendada aspirina de 500mg a cada 4 horas, durante 3 dias.
  example_title: Example 3
- text: Após as sessões de fisioterapia o paciente apresenta recuperação de mobilidade.
  example_title: Example 4
- text: O paciente está em Quimioterapia com uma dosagem específica de Cisplatina para o tratamento do cancro do pulmão.
  example_title: Example 5
- text: Monitorização da Freq. cardíaca com 90 bpm. P Arterial de 120-80 mmHg
  example_title: Example 6
- text: A ressonância magnética da utente revelou uma ruptura no menisco lateral do joelho.
  example_title: Example 7
- text: A paciente foi diagnosticada com esclerose múltipla e iniciou terapia com imunomoduladores.
---

# MediAlbertina
The first publicly available medical language models trained with real European Portuguese data.

MediAlbertina is a family of DeBERTaV2-based encoders from the BERT family, obtained by continuing the pre-training of [PORTULAN's Albertina](https://huggingface.co/PORTULAN) models on Electronic Medical Records shared by Portugal's largest public hospital.

Like their predecessors, MediAlbertina models are distributed under the [MIT license](https://huggingface.co/portugueseNLP/medialbertina_pt-pt_900m/blob/main/LICENSE).

# Model Description

MediAlbertina PT-PT 900M NER was created through domain adaptation of [MediAlbertina PT-PT 900M](https://huggingface.co/portugueseNLP/medialbertina_pt-pt_900m) on real European Portuguese EMRs that were hand-annotated for the following entities:
- Diagnostico (diagnosis)
- Sintoma (symptom)
- Medicamento (medication)
- Dosagem (dosage)
- ProcedimentoMedico (medical procedure)
- SinalVital (vital sign)
- Resultado (result)
- Progresso (progress)

MediAlbertina PT-PT 900M NER outperformed the same adaptation applied to a non-medical Portuguese language model, which demonstrates the effectiveness of this domain adaptation and its potential for medical AI in Portugal.

| Model | NER single-model (F1-score) | NER multi-models (F1-score) | Assertion Status (F1-score) |
|-------------------------|:----------------:|:----------------:|:----------------:|
| albertina-900m-portuguese-ptpt-encoder | 0.813 | 0.811 | 0.687 |
| **medialbertina_pt-pt_900m** | **0.832** | **0.848** | **0.755** |
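
To make the gap between the two rows concrete, the following minimal sketch computes the absolute and relative F1 gains from the values in the table above. The dictionary layout and the `f1_gains` helper are illustrative assumptions and are not part of the model's code.

```python
# F1-scores copied from the table above; the dict layout is an assumption for illustration.
BASELINE = {"NER single-model": 0.813, "NER multi-models": 0.811, "Assertion Status": 0.687}
MEDIALBERTINA = {"NER single-model": 0.832, "NER multi-models": 0.848, "Assertion Status": 0.755}


def f1_gains(baseline, adapted):
    """Return {task: (absolute_gain, relative_gain_percent)} for each task."""
    gains = {}
    for task, base_score in baseline.items():
        absolute = adapted[task] - base_score
        gains[task] = (round(absolute, 3), round(100 * absolute / base_score, 1))
    return gains


for task, (absolute, relative) in f1_gains(BASELINE, MEDIALBERTINA).items():
    print(f"{task}: +{absolute:.3f} F1 ({relative:.1f}% relative)")
```

On these figures the largest gain is on assertion status (0.068 F1) and the smallest on single-model NER (0.019 F1).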

## Data

MediAlbertina PT-PT 900M NER was fine-tuned on more than 10,000 hand-annotated entities drawn from more than 1,000 fully anonymized medical sentences from Portugal's largest public hospital. This data was acquired under the framework of the [FCT project DSAIPA/AI/0122/2020 AIMHealth-Mobile Applications Based on Artificial Intelligence](https://ciencia.iscte-iul.pt/projects/aplicacoes-moveis-baseadas-em-inteligencia-artificial-para-resposta-de-saude-publica/1567).

## How to use

```Python
from transformers import pipeline

# aggregation_strategy='average' groups sub-word tokens into word-level entity spans
ner_pipeline = pipeline('ner', model='portugueseNLP/medialbertina_pt-pt_900m_NER', aggregation_strategy='average')
sentence = 'Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente.'
entities = ner_pipeline(sentence)
# Print each entity's label next to the text span it covers
for entity in entities:
    print(f"{entity['entity_group']} - {sentence[entity['start']:entity['end']]}")
```
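
When post-processing the pipeline output, it can be convenient to group the extracted spans by entity type and to drop low-confidence predictions. The sketch below assumes the aggregated output format of the `transformers` token-classification pipeline (dicts with `entity_group`, `score`, `start` and `end` keys). The `group_entities` and `mock_entity` helpers, the `min_score` threshold, and the mock predictions are hypothetical and hand-written so the example runs without downloading the model.

```python
from collections import defaultdict


def group_entities(sentence, entities, min_score=0.5):
    """Group entity spans by label, skipping predictions scored below min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            span = sentence[entity["start"]:entity["end"]]
            grouped[entity["entity_group"]].append(span)
    return dict(grouped)


def mock_entity(sentence, text, label, score):
    """Build a hand-written prediction in the aggregated pipeline output format."""
    start = sentence.find(text)
    return {"entity_group": label, "score": score, "word": text,
            "start": start, "end": start + len(text)}


sentence = 'Durante o procedimento endoscópico, foram encontrados pólipos no cólon do paciente.'
# Hypothetical predictions for illustration only; not actual model output.
mock_entities = [
    mock_entity(sentence, "procedimento endoscópico", "ProcedimentoMedico", 0.98),
    mock_entity(sentence, "pólipos", "Resultado", 0.91),
    mock_entity(sentence, "cólon", "Diagnostico", 0.32),  # dropped by the threshold
]

print(group_entities(sentence, mock_entities))
```

With the real model, the list returned by `ner_pipeline(sentence)` can be passed in place of `mock_entities`.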

## Citation

MediAlbertina is developed by a joint team from [ISCTE-IUL](https://www.iscte-iul.pt/), Portugal, and [Select Data](https://selectdata.com/), CA, USA. For a fully detailed description, check the respective publication:

```latex
Publication in progress. The reference will be added soon.
```
Please use the above canonical reference when using or citing this model.