osiria committed: Commit 59c9591 (parent e93341a)

Update README.md

Files changed (1): README.md (+7 -1)

README.md CHANGED
@@ -52,6 +52,10 @@ pipeline_tag: question-answering
 
 This is a <b>DeBERTa</b> <b>[1]</b> model for the <b>Italian</b> language, fine-tuned for <b>Extractive Question Answering</b> on the [SQuAD-IT](https://huggingface.co/datasets/squad_it) dataset <b>[2]</b>, using <b>DeBERTa-ITALIAN</b> ([deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian)) as a pre-trained model.
 
+<b>Update: Version 2.0</b>
+
+This version further improves performance through a two-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, initial learning rate of 3e-5) and then further fine-tuned on the Italian SQuAD-IT (2 epochs, initial learning rate of 3e-5, no warmup).
+
 <h3>Training and Performances</h3>
 
 The model is trained to perform question answering, given a context and a question (under the assumption that the context contains the answer to the question). It has been fine-tuned for Extractive Question Answering, using the SQuAD-IT dataset, for 2 epochs with a linearly decaying learning rate starting from 3e-5, a maximum sequence length of 384 and a document stride of 128.
@@ -59,9 +63,11 @@ The model is trained to perform question answering, given a context and a questi
 
 The performances on the test set are reported in the following table:
 
+(version 2.0 performances)
+
 | EM | F1 |
 | ------ | ------ |
-| 68.80 | 80.08 |
+| 70.04 | 80.97 |
 
 Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/main/osiria_deberta_italian_qa_evaluation.ipynb
 
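The two fine-tuning phases above both use a linearly decaying learning rate, one with warmup and one without. As a minimal sketch (assuming the standard linear-warmup/linear-decay shape used by common schedulers; the function name, step counts, and formulation here are illustrative, not part of the model card):

```python
# Illustrative sketch of the learning-rate schedules described in the card:
# linear warmup up to a peak, then linear decay down to zero.
# Only the peak LR (3e-5), epoch counts, and 20% warmup ratio come from the
# card; the step-based formulation is an assumption for illustration.

def linear_lr(step: int, total_steps: int, peak_lr: float = 3e-5,
              warmup_ratio: float = 0.0) -> float:
    """Learning rate at a given optimizer step."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Linear decay from peak_lr down to 0 at the final step.
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

# Phase 1 (English SQuAD v2): 1 epoch, 20% warmup, peak 3e-5 (step count assumed).
phase1 = [linear_lr(s, total_steps=1000, warmup_ratio=0.2) for s in range(1001)]
# Phase 2 (SQuAD-IT): 2 epochs, no warmup, starts at 3e-5 and decays linearly.
phase2 = [linear_lr(s, total_steps=2000, warmup_ratio=0.0) for s in range(2001)]
```

With warmup, the rate climbs from 0 to the peak over the first 20% of steps and then decays; without warmup, it starts at the peak and decays immediately, matching the "starting from 3e-5" wording in the card.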