---
language:
- en
tags:
- automotive
---

WG-BERT (Warranty and Goodwill) is a pretrained, encoder-based model for analyzing automotive entities in automotive-related texts. WG-BERT is built by continually pretraining the BERT language model on a corpus of automotive (workshop feedback) texts via the masked language modeling (MLM) objective, adapting it to the automotive domain. WG-BERT is further fine-tuned for automotive entity recognition (a subtask of Named Entity Recognition (NER)) to extract components and their complaints from automotive texts.

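As a usage sketch only, the snippet below shows how a fine-tuned token-classification checkpoint of this kind is typically loaded with the Hugging Face transformers library. The model identifier, the label names (COMPONENT, COMPLAINT), and the example sentence are illustrative assumptions, not confirmed details of the released model.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Hypothetical checkpoint name; replace with the actual WG-BERT model ID.
model_id = "wg-bert-ner"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# "simple" aggregation merges sub-word tokens back into whole entity spans.
ner = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

print(ner("The brake pads are worn out and the engine is rattling."))
# With the assumed label set, the output would resemble:
# [{'entity_group': 'COMPONENT', 'word': 'brake pads', ...},
#  {'entity_group': 'COMPLAINT', 'word': 'worn out', ...}, ...]
```
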
The dataset for continual pretraining consists of 1.8 million workshop feedback texts, comprising ~4 million sentences.

The dataset for fine-tuning consists of ~5,500 sentences gold-annotated by automotive domain experts.

We use BERT-base-uncased as the training architecture.

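The continual pretraining step could look roughly like the following minimal sketch using the Hugging Face transformers Trainer, starting from the bert-base-uncased checkpoint named above. The corpus file name, hyperparameters, and output path are illustrative assumptions and do not describe the actual training configuration.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the base checkpoint named in this model card.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical corpus file: one workshop feedback sentence per line.
dataset = load_dataset("text", data_files={"train": "workshop_feedback.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking with the standard 15% MLM probability.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="wg-bert-mlm",        # illustrative output path
    num_train_epochs=1,              # illustrative hyperparameters
    per_device_train_batch_size=32,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```
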
Please contact Lukas Weber (lukas-weber[at]hotmail[dot]de / lukas.l.weber[at]mercedes-benz[dot]com) with any WG-BERT-related issues and questions.