---
license: mit
base_model: Sarmila/pubmed-bert-squad-covidqa
tags:
- generated_from_trainer
- biology
datasets:
- covid_qa_deepset
- squad
model-index:
- name: pubmed-bert-squad-covidqa
  results: []
language:
- en
pipeline_tag: question-answering
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# pubmed-bert-squad-covidqa

This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext), trained first on SQuAD and then on the covid_qa_deepset dataset.

It achieves the following results on the SQuAD evaluation set:
- Exact match: 59.0
- F1: 76.32473929579194
- Loss: 1.003116

It achieves the following results on the covid_qa_deepset evaluation set:
- Loss: 0.4876
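
For readers unfamiliar with the exact-match and F1 figures reported for SQuAD, the following is a minimal sketch of how SQuAD-style scores are commonly computed for a single prediction. It assumes the standard SQuAD v1.1 answer normalization (lowercasing, stripping punctuation and English articles); the card does not state which evaluation script was actually used, and the function names here are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the mean of these per-example values, taking the best score over all reference answers for each question and reporting it as a percentage.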

## Model description

This model was trained to test the PubMedBERT biomedical language model in a question-answering pipeline. When testing on our custom dataset, we realized that the model performed poorly when used directly for QA. We therefore decided to fine-tune it on covid_qa_deepset so that it would learn answer extraction. Although the covid_qa_deepset data is very similar to our intended use case, it is small, so this step alone brought little improvement.

Therefore, we first trained the model on the SQuAD dataset, which is larger, and then trained it on covid_qa_deepset. SQuAD helped the model learn how to extract answers, and covid_qa_deepset adapted it to a domain similar to ours, i.e. biomedicine.

Furthermore, we first performed masked language modeling (MLM) on PubMedBERT using our own dataset and then ran exactly the same pipeline to see the difference; that model is [here].
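
As a rough illustration of how a checkpoint like this one is typically queried for extractive QA, here is a minimal sketch using the `transformers` question-answering pipeline. The Hub id is an assumption taken from this card's metadata, and `build_qa_input` and `answer` are hypothetical helper names, not part of the original training code.

```python
# Assumed Hub id, taken from this card's metadata; adjust if the model lives elsewhere.
MODEL_ID = "Sarmila/pubmed-bert-squad-covidqa"


def build_qa_input(question: str, context: str) -> dict:
    """Package a question and its context as keyword arguments for the QA pipeline."""
    if not question.strip() or not context.strip():
        raise ValueError("question and context must both be non-empty")
    return {"question": question.strip(), "context": context.strip()}


def answer(question: str, context: str, model_id: str = MODEL_ID) -> dict:
    """Run extractive QA on one question; downloads the model on first use."""
    # Imported lazily: requires the `transformers` package and network access.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id)
    return qa(**build_qa_input(question, context))
```

Calling `answer(question, context)` returns a dictionary containing the extracted answer span, its character offsets in the context, and a confidence score.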

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
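
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a linear decay from 2e-05 to zero. It assumes no warmup (the card does not mention any) and uses the 153 total steps reported in the training results table; `linear_lr` is an illustrative helper, not the Trainer's own implementation.

```python
BASE_LR = 2e-5      # learning_rate from the hyperparameter list
TOTAL_STEPS = 153   # 3 epochs x 51 steps per epoch, per the training results table
WARMUP_STEPS = 0    # assumption: the card does not mention warmup


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate has fallen to two thirds of its initial value by the end of the first epoch (step 51) and reaches zero at step 153.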

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 51   | 0.4001          |
| No log        | 2.0   | 102  | 0.4524          |
| No log        | 3.0   | 153  | 0.4876          |


### Framework versions

- Transformers 4.33.0
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.13.3