Fine-tuning model
Hi, I found this model very interesting and would like to use it for my task, after fine-tuning it on my own data.
So I used this link as a training guide: https://spacy.io/usage/training#basics
I also used your it_nerIta_trf/config.cfg file for the fine-tuning.
Starting the training shows this:
=========================== Initializing pipeline ===========================
[2022-12-05 15:22:00,040] [INFO] Set up nlp object from config
[2022-12-05 15:22:00,046] [INFO] Pipeline: ['transformer', 'ner']
[2022-12-05 15:22:00,048] [INFO] Created vocabulary
[2022-12-05 15:22:00,049] [INFO] Finished initializing nlp object
Some weights of the model checkpoint at bullmount/hseBert-it-cased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertModel were not initialized from the model checkpoint at bullmount/hseBert-it-cased and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2022-12-05 15:22:06,822] [INFO] Initialized pipeline components: ['transformer', 'ner']
Training seems to proceed well, albeit slowly. However, I can't figure out which of the following two cases applies to me, or whether this can cause problems during training:
This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
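In case it helps narrow this down, I grouped the weight names from the two warnings by their prefix. This is only a minimal sketch in plain Python: the lists are copied from the log above, and the helper name group_by_prefix is my own, not part of spaCy or transformers.

```python
from collections import defaultdict

# Weight names copied verbatim from the two warnings in the log above.
unused = [
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.decoder.bias",
    "cls.predictions.transform.LayerNorm.bias",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.decoder.weight",
    "cls.predictions.bias",
]
newly_initialized = [
    "bert.pooler.dense.weight",
    "bert.pooler.dense.bias",
]


def group_by_prefix(names, depth=2):
    """Group dotted weight names by their first `depth` components.

    Hypothetical helper written for this post only.
    """
    groups = defaultdict(list)
    for name in names:
        prefix = ".".join(name.split(".")[:depth])
        groups[prefix].append(name)
    return dict(groups)


unused_groups = group_by_prefix(unused)
new_groups = group_by_prefix(newly_initialized)

print(sorted(unused_groups))  # ['cls.predictions']
print(sorted(new_groups))     # ['bert.pooler']
```

If I read this correctly, every unused weight sits under cls.predictions (which I assume is the language-modeling head left over from pretraining), and the only newly initialized weights belong to bert.pooler. I would still appreciate confirmation that this matches the first case.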
Incidentally, the same warnings appear even if I use a standard configuration file provided by the spaCy site mentioned above.
Can you help me?
Thanks in advance.