arxiv:2010.10906

German's Next Language Model

Published on Oct 21, 2020

Authors:

Stefan Schweter ,

Abstract

In this work we present the experiments which lead to the creation of our BERT and ELECTRA based German language models, GBERT and GELECTRA. By varying the input training data, model size, and the presence of Whole Word Masking (WWM) we were able to attain SoTA performance across a set of document classification and named entity recognition (NER) tasks for both models of base and large size. We adopt an evaluation driven approach in training these models and our results indicate that both adding more data and utilizing WWM improve model performance. By benchmarking against existing German models, we show that these models are the best German models to date. Our trained models will be made publicly available to the research community.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 7

Browse 7 models citing this paper

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2010.10906 in a dataset README.md to link it from this page.

Spaces citing this paper 5

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.