The foundation of this model is the RoBERTa-style model deepset/gbert-large. Following Gururangan et al. (2020), we gathered a collection of narrative fiction and continued the model's pre-training task on it. Training was performed for 10 epochs on 2.3 GB of text with a learning rate of 0.0001 (linearly decayed) and a batch size of 512.
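
The sketch below shows how such continued pre-training could be set up with the `transformers` Trainer, assuming a masked-language-modeling objective and a corpus stored as plain-text files. The file path, sequence length, and the per-device/accumulation split of the 512 batch size are illustrative assumptions, not the exact configuration used here.

```python
# Minimal sketch of continued MLM pre-training on deepset/gbert-large.
# Paths and the batch-size split (64 * 8 = 512) are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "deepset/gbert-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Load the raw fiction corpus and tokenize it into fixed-length inputs.
raw = load_dataset("text", data_files={"train": "fiction_corpus/*.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="gbert-large-fiction",
    num_train_epochs=10,
    learning_rate=1e-4,
    lr_scheduler_type="linear",     # linearly decayed learning rate
    per_device_train_batch_size=64,
    gradient_accumulation_steps=8,  # effective batch size of 512 on one device
    save_strategy="epoch",
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
```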