A small version of DeBERTa trained on the clean version of Google's C4 dataset. For details about the model's size and architecture, see `config.json`.
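
For instance, the size parameters can be inspected programmatically. A minimal sketch, assuming the checkpoint's `config.json` exposes the standard DeBERTa configuration fields:

```python
from transformers import AutoConfig

# Download and parse config.json from the Hugging Face Hub.
config = AutoConfig.from_pretrained("lucadiliello/deberta-small")

# Standard DeBERTa config fields; assumed to be present for this checkpoint.
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)
```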

The model was trained for 100K steps with a batch size of 2048 and a sequence length of 512, for a total of roughly 105B tokens (100,000 × 2048 × 512 ≈ 1.05 × 10¹¹).

The vocabulary and the tokenizer are the same as those of `microsoft/deberta-base`.
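
A minimal usage sketch, assuming the checkpoint loads via the standard `transformers` Auto classes:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# The tokenizer is shared with microsoft/deberta-base, so either
# repository id should yield equivalent tokenization.
tokenizer = AutoTokenizer.from_pretrained("lucadiliello/deberta-small")
model = AutoModel.from_pretrained("lucadiliello/deberta-small")

# Encode a sentence and extract contextual token embeddings.
inputs = tokenizer("DeBERTa uses disentangled attention.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)
```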
