---
language:
- en
- fr
- ro
- de
license: apache-2.0
library_name: transformers
datasets:
- c4
---

# Model Card for EncT5

EncT5 is a variant of T5 that mainly utilizes the encoder for non-autoregressive (i.e., classification and regression)
tasks. The model is from the paper [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)
by Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, and Jing Li.



## Model Details

### Model Description

EncT5 uses the same base weights as T5, but **must be fine-tuned before use**. EncT5 has several distinguishing
features (sketched in code after this list):

1. It has fewer decoder layers (a single decoder layer by default), and so uses fewer parameters and resources than
   the standard T5.
2. It has a separate decoder word embedding, with the decoder input ids being predefined constants. During
   fine-tuning, the decoder embedding learns to use these constants as "prompts" to the encoder for the corresponding
   classification/regression tasks.
3. It has a classification head on top of the decoder output.
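
To make these features concrete, here is a minimal PyTorch sketch of the idea. It is illustrative only: the real
EncT5 uses T5-style attention and its own configuration, and every name and dimension here (`d_model=768`,
`num_heads=12`, `num_prompts=1`) is an assumption, not the repository's API.

```python
import torch
import torch.nn as nn

class EncT5Sketch(nn.Module):
    """Illustrative sketch of the EncT5 head described above. This is NOT
    the actual implementation; layer and dimension choices are assumptions."""

    def __init__(self, d_model=768, num_heads=12, num_labels=2, num_prompts=1):
        super().__init__()
        # Feature 2: a separate decoder word embedding whose input ids are
        # predefined constants (0..num_prompts-1), learned during fine-tuning.
        self.decoder_embedding = nn.Embedding(num_prompts, d_model)
        # Feature 1: a single decoder layer instead of the full T5 decoder stack.
        self.decoder_layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True
        )
        # Feature 3: a classification head on top of the decoder output.
        self.classifier = nn.Linear(d_model, num_labels)

    def forward(self, encoder_hidden_states):
        # encoder_hidden_states: (batch, seq_len, d_model) from the T5 encoder.
        batch_size = encoder_hidden_states.size(0)
        prompt_ids = torch.arange(
            self.decoder_embedding.num_embeddings,
            device=encoder_hidden_states.device,
        )
        # The constant "prompt" embeddings cross-attend over the encoder output.
        prompts = self.decoder_embedding(prompt_ids).expand(batch_size, -1, -1)
        decoded = self.decoder_layer(tgt=prompts, memory=encoder_hidden_states)
        # Classify from the (first) prompt position.
        return self.classifier(decoded[:, 0])
```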

Research has shown that this model can be more efficient than, and as effective as, T5 and BERT on
non-autoregressive tasks such as classification and regression.

- **Developed by:** Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, Jing Li. See the
  [associated paper](https://arxiv.org/abs/2110.08426).
- **Model type:** Language Model
- **Language(s) (NLP):** English, French, Romanian, German
- **License:** Apache 2.0
- **Based on model:** [T5](https://huggingface.co/google-t5/t5-base)
- **Repository:** [GitHub repo](https://github.com/hackyon/EncT5)
- **Paper:** [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("hackyon/enct5-base", trust_remote_code=True)
# Fine-tune the model before use.
```

See the [GitHub repo](https://github.com/hackyon/EncT5) for a more comprehensive guide.
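
Once fine-tuned, inference follows the standard sequence-classification pattern. The snippet below is a hypothetical
example: it assumes the repository ships a T5-compatible tokenizer loadable via `AutoTokenizer`, which should be
verified against the repo.

```python
from transformers import AutoTokenizer

# Assumption: the repo provides a T5-compatible tokenizer.
tokenizer = AutoTokenizer.from_pretrained("hackyon/enct5-base", trust_remote_code=True)

inputs = tokenizer("This movie was surprisingly good!", return_tensors="pt")
logits = model(**inputs).logits  # model fine-tuned as described above
predicted_class = logits.argmax(dim=-1).item()
```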

## Training Details

### Training Data

The weights of this model are directly copied from [t5-base](https://huggingface.co/google-t5/t5-base).

### Training Procedure 

This model **must be fine-tuned** before use. The decoder word embedding and classification head are both untrained.
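
As a starting point, here is a minimal fine-tuning sketch using the Hugging Face `Trainer`. The toy dataset, the
`num_labels` argument, and all hyperparameters are illustrative assumptions; consult the
[GitHub repo](https://github.com/hackyon/EncT5) for the authors' actual fine-tuning setup.

```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumption: the remote code accepts num_labels like standard HF classifiers.
model = AutoModelForSequenceClassification.from_pretrained(
    "hackyon/enct5-base", num_labels=2, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("hackyon/enct5-base", trust_remote_code=True)

# Toy two-example dataset; substitute a real labeled dataset in practice.
texts = ["a wonderful film", "a complete waste of time"]
labels = [1, 0]
encodings = tokenizer(texts, truncation=True, padding=True)

class ToyDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="enct5-finetuned", num_train_epochs=3),
    train_dataset=ToyDataset(encodings, labels),
)
trainer.train()
```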