---
language:
- en
- fr
- ro
- de
license: apache-2.0
library_name: transformers
datasets:
- c4
---
# Model Card for EncT5
EncT5 is a variant of T5 that mainly uses the encoder for non-autoregressive tasks (i.e., classification and
regression). The model is from the paper [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)
by Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, and Jing Li.
## Model Details
### Model Description
EncT5 uses the same base weights as T5, but **must be fine-tuned before use**. EncT5 has several distinguishing
features:
1. It has fewer decoder layers (a single decoder layer by default), and so fewer parameters and lower resource
requirements than standard T5.
2. It has a separate decoder word embedding, and the decoder input ids are predefined constants. During
fine-tuning, the decoder embedding learns to use these constants as "prompts" to the encoder for the corresponding
classification/regression tasks.
3. It has a classification head on top of the decoder output.
The paper reports that this model can be more efficient than, and perform comparably to, T5 and BERT on
non-autoregressive tasks such as classification and regression.
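As an illustration only, the numbered features above can be sketched in plain PyTorch. All names and dimensions below are toy stand-ins, not the model's actual implementation:

```python
import torch
import torch.nn as nn

class EncT5StyleHead(nn.Module):
    """Toy sketch of the EncT5 head: a constant decoder input id is looked up
    in a separate decoder embedding, cross-attends to the encoder states, and
    feeds a classification head. Names and sizes are illustrative."""

    def __init__(self, d_model: int = 64, num_labels: int = 3):
        super().__init__()
        self.decoder_embed = nn.Embedding(1, d_model)     # separate decoder word embedding
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(d_model, num_labels)  # head on the decoder output

    def forward(self, encoder_states: torch.Tensor) -> torch.Tensor:
        batch = encoder_states.size(0)
        # Predefined constant decoder input id (always 0) acts as a learned "prompt".
        ids = torch.zeros(batch, 1, dtype=torch.long)
        query = self.decoder_embed(ids)                   # (batch, 1, d_model)
        attended, _ = self.cross_attn(query, encoder_states, encoder_states)
        return self.classifier(attended[:, 0])            # (batch, num_labels)

head = EncT5StyleHead()
encoder_states = torch.randn(2, 10, 64)  # stand-in for T5 encoder output
logits = head(encoder_states)
print(logits.shape)  # torch.Size([2, 3])
```

The key design point, per the paper, is that almost all parameters come from the pretrained T5 encoder; only the small decoder embedding and head are new.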
- **Developed by:** Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, Jing Li. See the
[associated paper](https://arxiv.org/abs/2110.08426).
- **Model type:** Language Model
- **Language(s) (NLP):** English, French, Romanian, German
- **License:** Apache 2.0
- **Based on model:** [T5](https://huggingface.co/google-t5/t5-base)
- **Repository:** [GitHub repo](https://github.com/hackyon/EncT5)
- **Paper:** [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("hackyon/enct5-base", trust_remote_code=True)

# Fine-tune the model before use.
```
See the [GitHub repo](https://github.com/hackyon/EncT5) for a more comprehensive guide.
## Training Details
### Training Data
The weights of this model are directly copied from [t5-base](https://huggingface.co/google-t5/t5-base).
### Training Procedure
This model **must be fine-tuned** before use. The decoder word embedding and classification head are both untrained.
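As a rough illustration (plain PyTorch with toy stand-ins, not the model's real API), any fine-tuning loop must push gradients into both of these randomly initialized pieces:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two untrained pieces (sizes are illustrative).
decoder_embed = nn.Embedding(1, 64)   # separate decoder word embedding
classifier = nn.Linear(64, 3)         # classification head

params = list(decoder_embed.parameters()) + list(classifier.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-3)

# Pretend pooled encoder output and gold labels for a batch of 4 examples.
pooled = torch.randn(4, 64)
labels = torch.tensor([0, 2, 1, 0])

# Constant decoder input id 0 supplies the learned "prompt" embedding.
prompt = decoder_embed(torch.zeros(4, dtype=torch.long))
logits = classifier(pooled + prompt)  # crude fusion standing in for cross-attention
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
optimizer.step()
print(logits.shape)  # torch.Size([4, 3])
```

In practice you would fine-tune through the Hugging Face model loaded above rather than hand-rolling a loop; this sketch only shows why skipping fine-tuning leaves the head producing random logits.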