Overview

Language model: gbert-large-sts

Language: German
Training data: German STS benchmark train and dev set
Eval data: German STS benchmark test set
Infrastructure: 1x V100 GPU
Published: August 12th, 2021

Details

  • We fine-tuned a gbert-large model to estimate the semantic similarity of German-language text pairs. The dataset is a machine-translated version of the STS benchmark, which is available here.
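STS benchmark gold labels are similarity scores between 0 and 5, and models for this task are commonly scored by how well their predictions correlate with those labels. This card does not state which metric was used, so the following is only a minimal sketch of the two usual choices (Pearson and Spearman correlation) in plain Python. The function names and the toy scores are made up for illustration and are not results of this model.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy example with invented numbers (not model output).
gold = [0.0, 1.5, 3.0, 5.0]
pred = [0.2, 1.0, 3.5, 4.8]
print(f"Pearson:  {pearson(gold, pred):.3f}")
print(f"Spearman: {spearman(gold, pred):.3f}")
```

Spearman only looks at the ordering of the pairs, so it is unaffected by a model that ranks pairs correctly but predicts on a different scale. Pearson also rewards a linear relationship to the gold scores.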

Hyperparameters

batch_size = 16
n_epochs = 4
warmup_ratio = 0.1
learning_rate = 2e-5
lr_schedule = LinearWarmup
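To make the schedule concrete, here is a minimal sketch of how the learning rate could evolve under these settings. It assumes that LinearWarmup means a linear ramp from 0 to learning_rate over the first warmup_ratio of all optimizer steps, followed by a linear decay back to 0. The card does not spell out the decay, so treat that part as an assumption. The function names and the dataset size of 1000 pairs are hypothetical.

```python
import math


def total_training_steps(n_examples, batch_size=16, n_epochs=4):
    """Number of optimizer steps, assuming one step per batch."""
    return math.ceil(n_examples / batch_size) * n_epochs


def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a 0-indexed optimizer step.

    Assumption: linear ramp-up during warmup, then linear decay to zero.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical dataset size, chosen only for illustration.
n_examples = 1000
total = total_training_steps(n_examples)
for step in (0, total // 20, total // 10, total // 2, total):
    print(step, linear_warmup_lr(step, total))
```

With a warmup ratio of 0.1, the peak learning rate of 2e-5 is reached after roughly the first tenth of training, which limits large updates to the pretrained weights early on.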

Performance

Stay tuned... and watch out for new papers on arxiv.org ;)

Authors

  • Julian Risch: julian.risch [at] deepset.ai
  • Timo Möller: timo.moeller [at] deepset.ai
  • Julian Gutsch: julian.gutsch [at] deepset.ai
  • Malte Pietsch: malte.pietsch [at] deepset.ai

About us

deepset is the company behind the production-ready open-source AI framework Haystack.

Get in touch and join the Haystack community

For more info on Haystack, visit our GitHub repo and Documentation.

We also have a Discord community open to everyone!

Twitter | LinkedIn | Discord | GitHub Discussions | Website | YouTube

By the way: we're hiring!
