The foundation of this model is the RoBERTa-style model deepset/gbert-large.
Following Gururangan et al. (2020), we gathered a collection of narrative fiction and continued the model's pre-training task on it.
Training was performed for 10 epochs on 2.3 GB of text with a learning rate of 0.0001 (linear decay) and a batch size of 512.
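
The continued pre-training can be reproduced with the standard Hugging Face masked-language-modeling recipe. The sketch below is only an illustration of that setup with the hyperparameters listed above; the actual training script, the corpus file name (`fiction.txt` is a placeholder), and the per-device batch size / gradient accumulation split are assumptions, not part of this card.

```python
# Minimal sketch of domain-adaptive pre-training on deepset/gbert-large,
# assuming the standard Transformers MLM recipe (not the original script).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "deepset/gbert-large"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# "fiction.txt" is a placeholder for the ~2.3 GB corpus of narrative fiction.
dataset = load_dataset("text", data_files={"train": "fiction.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modeling objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# Hyperparameters from the card: 10 epochs, learning rate 1e-4 with linear
# decay (the Trainer default schedule), effective batch size 512.
args = TrainingArguments(
    output_dir="fiction-gbert-large",
    num_train_epochs=10,
    learning_rate=1e-4,
    lr_scheduler_type="linear",
    per_device_train_batch_size=32,
    gradient_accumulation_steps=16,  # 32 * 16 = 512 effective batch size (assumed split)
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```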
