---
datasets:
- HuggingFaceFW/fineweb
language:
- en
---
# Encoder-Decoder model with DeBERTa encoder
## Pre-trained models
- Encoder: `microsoft/deberta-v3-small`
- Decoder: `deliciouscat/deberta-v3-base-decoder-v0.1` (6 transformer layers, 8 attention heads)
-> 297,511,524 (~298M) parameters in total
## Data used
`HuggingFaceFW/fineweb` -> sampled 124,800 documents
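The card does not state how the sample was drawn; a minimal sketch, assuming streaming access and simply taking the first 124,800 documents:

```python
from datasets import load_dataset

# Stream fineweb and take 124,800 documents (assumed sampling procedure).
fineweb = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
texts = [example["text"] for example in fineweb.take(124_800)]
```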
## Training hparams
- optimizer: AdamW, lr=2.3e-5, betas=(0.875, 0.997)
- batch size: 12 (the maximum that fits on a Colab Pro A100)
-> trained with a BART-style denoising objective (a sketch follows below)
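The exact noising function, schedule, and step count are not given on this card, so the sketch below is only illustrative: it uses the stated AdamW settings and a simplified token-masking objective (BART's span infilling collapses whole spans into a single mask token), and it assumes the checkpoint's config defines `decoder_start_token_id` and `pad_token_id` so that `labels` can be shifted into decoder inputs.

```python
import torch
from torch.optim import AdamW

# `model` / `tokenizer` as loaded in the "How to use" section below.
optimizer = AdamW(model.parameters(), lr=2.3e-5, betas=(0.875, 0.997))

def mask_tokens(input_ids, tokenizer, mask_prob=0.15):
    """Simplified noising: replace random non-special tokens with [MASK]."""
    noisy = input_ids.clone()
    special = torch.tensor(
        [tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
         for ids in input_ids.tolist()],
        dtype=torch.bool,
    )
    mask = (torch.rand(noisy.shape) < mask_prob) & ~special
    noisy[mask] = tokenizer.mask_token_id
    return noisy

def training_step(batch_texts):
    enc = tokenizer(batch_texts, padding=True, truncation=True,
                    max_length=512, return_tensors="pt")
    labels = enc["input_ids"].clone()
    out = model(input_ids=mask_tokens(enc["input_ids"], tokenizer),
                attention_mask=enc["attention_mask"],
                labels=labels)  # reconstruct the clean text
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```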
## How to use
```python
from transformers import AutoTokenizer, EncoderDecoderModel
model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
```
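Generation goes through `model.generate`; a minimal sketch (the input sentence, `max_length`, and `num_beams` are illustrative, and the config is assumed to define `decoder_start_token_id`):

```python
inputs = tokenizer(
    "The quick brown fox [MASK] over the lazy dog.", return_tensors="pt"
)
output_ids = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```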
## Future work!
- train on more scientific data
- fine-tune on a keyword extraction task