---
datasets:
- HuggingFaceFW/fineweb
language:
- en
---
# Encoder-Decoder model with DeBERTa encoder

## Pre-trained models

- Encoder: `microsoft/deberta-v3-small`

- Decoder: `deliciouscat/deberta-v3-base-decoder-v0.1` (6 transformer layers, 8 attention heads)

-> 297,511,524 (~298M) parameters in total
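A quick way to check the reported size (a minimal sketch that simply loads the released checkpoint, as in the usage section below, and counts parameters):

```python
from transformers import EncoderDecoderModel

# Sanity check of the reported parameter count.
model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
print(f"total parameters: {model.num_parameters():,}")  # expected to be ~297.5M
```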

## Data used

`HuggingFaceFW/fineweb` -> 124,800 sampled documents
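The exact sampling procedure is not documented here; the sketch below only illustrates one way to draw such a sample by streaming FineWeb (the `sample-10BT` config name and the first-N selection are assumptions):

```python
from itertools import islice
from datasets import load_dataset

# Illustrative only: stream FineWeb and keep the first 124,800 documents.
stream = load_dataset("HuggingFaceFW/fineweb", name="sample-10BT", split="train", streaming=True)
texts = [example["text"] for example in islice(stream, 124_800)]
print(len(texts))  # 124800
```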

## Training hparams

- optimizer: AdamW, lr=2.3e-5, betas=(0.875, 0.997)

- batch size: 12 (the maximum that fits in a Colab Pro A100 environment)

-> trained with a BART-style denoising objective, as sketched below
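The denoising setup is not spelled out beyond "BART-style", so the sketch below only illustrates the general idea: corrupt the input by replacing random token spans with a mask token, then train the encoder-decoder to reconstruct the clean text. The corruption rate, span lengths, and single-example training step are assumptions, and the step assumes the checkpoint's config defines `pad_token_id` and `decoder_start_token_id`.

```python
import random
import torch
from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")

def mask_spans(ids, mask_token_id, mask_prob=0.3, mean_span=3):
    """BART-style text infilling (illustrative): replace random spans with a single mask token."""
    out, i = [], 0
    while i < len(ids):
        if random.random() < mask_prob / mean_span:
            out.append(mask_token_id)
            i += random.randint(1, 2 * mean_span - 1)  # skip the span being masked
        else:
            out.append(ids[i])
            i += 1
    return out

optimizer = torch.optim.AdamW(model.parameters(), lr=2.3e-5, betas=(0.875, 0.997))

text = "FineWeb is a large corpus of cleaned and deduplicated English web text."
clean = tokenizer(text, truncation=True)["input_ids"]
noisy = mask_spans(clean, tokenizer.mask_token_id)

# Corrupted text goes to the encoder; the clean text is the reconstruction target.
loss = model(input_ids=torch.tensor([noisy]), labels=torch.tensor([clean])).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```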

## How to use

```python
from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.2")
```
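
Building on the snippet above, a hedged generation example (the input text, beam settings, and max length are illustrative; it assumes `decoder_start_token_id` is set in the checkpoint's generation config):

```python
text = "The DeBERTa encoder produces [MASK] representations of the input text."
inputs = tokenizer(text, return_tensors="pt")

output_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=64,
    num_beams=4,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```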

## Future work!

- train on more scientific data

- fine-tune on a keyword extraction task