The foundation of this model is the RoBERTa-style model deepset/gbert-large. Following Gururangan et al. (2020), we gathered a collection of narrative fiction and continued the model's pre-training task on it. Training was performed for 10 epochs on 2.3 GB of text with a learning rate of 0.0001 (linearly decayed) and a batch size of 512.
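
The sketch below shows how such continued pre-training could be set up with the `transformers` Trainer, assuming a masked-language-modeling objective and a corpus stored as plain-text files. The file path, sequence length, and the per-device/accumulation split of the 512 batch size are illustrative assumptions, not the exact configuration used here.

```python
# Minimal sketch of continued MLM pre-training on deepset/gbert-large.
# Paths and the batch-size split (64 * 8 = 512) are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "deepset/gbert-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Load the raw fiction corpus and tokenize it into fixed-length inputs.
raw = load_dataset("text", data_files={"train": "fiction_corpus/*.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="gbert-large-fiction",
    num_train_epochs=10,
    learning_rate=1e-4,
    lr_scheduler_type="linear",     # linearly decayed learning rate
    per_device_train_batch_size=64,
    gradient_accumulation_steps=8,  # effective batch size of 512 on one device
    save_strategy="epoch",
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
```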