# Model Card for mT5-small-VT_deplain-apa

A fine-tuned mT5 model for German sentence-level text simplification.
## Model Details

### Model Description

- Model type: Encoder-decoder Transformer
- Language(s) (NLP): German
- Finetuned from model: google/mt5-small
- Task: Text simplification
## Training Details

### Training Data

DEplain/DEplain-APA-sent (Stodden et al., 2023; arXiv:2305.18939)
### Training Procedure

Parameter-efficient fine-tuning with LoRA.
The vocabulary was adjusted through Vocabulary Transfer (Mosin et al., 2021; arXiv:2112.14569).
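Vocabulary Transfer initializes the embeddings of a new (here: German-adapted) vocabulary from the pretrained embedding matrix. The sketch below illustrates the core idea with toy vocabularies and a stand-in tokenizer; all names and data are illustrative, not taken from the actual training run.

```python
import torch

# Toy "old" (pretrained) vocabulary and embedding matrix.
old_vocab = {"<unk>": 0, "\u2581ein": 1, "fach": 2, "\u2581satz": 3}
old_emb = torch.randn(len(old_vocab), 8)

# Toy "new" vocabulary after tokenizer adaptation.
new_vocab = {"<unk>": 0, "\u2581einfach": 1, "\u2581satz": 2}

def old_tokenize(token: str):
    # Stand-in for the old tokenizer: greedy longest-match over old_vocab.
    pieces, rest = [], token
    while rest:
        for end in range(len(rest), 0, -1):
            if rest[:end] in old_vocab:
                pieces.append(rest[:end])
                rest = rest[end:]
                break
        else:
            pieces.append("<unk>")
            rest = rest[1:]
    return pieces

new_emb = torch.zeros(len(new_vocab), old_emb.size(1))
for token, idx in new_vocab.items():
    if token in old_vocab:
        # Token exists in both vocabularies: copy its embedding.
        new_emb[idx] = old_emb[old_vocab[token]]
    else:
        # New token: average the embeddings of its old-tokenizer pieces.
        piece_ids = [old_vocab[p] for p in old_tokenize(token)]
        new_emb[idx] = old_emb[piece_ids].mean(dim=0)
```

In the real setup, `new_emb` would then replace the model's input (and tied output) embedding matrix before fine-tuning.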
#### Training Hyperparameters

- Batch size: 16
- Epochs: 1
- Learning rate: 0.001
- Optimizer: Adafactor
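A minimal sketch of the optimizer setup with the hyperparameters above, using the Adafactor implementation from `transformers`. The dummy module stands in for the LoRA-wrapped model; whether the original run disabled Adafactor's relative-step schedule this way is an assumption, but `transformers` requires it when a fixed learning rate is passed.

```python
import torch
from transformers.optimization import Adafactor

# Dummy module standing in for the LoRA-wrapped mT5 model.
model = torch.nn.Linear(8, 8)

optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,                 # fixed learning rate from the card
    scale_parameter=False,   # both must be disabled when an explicit
    relative_step=False,     # learning rate is given
)
```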
#### LoRA Hyperparameters

- r: 32
- Alpha: 64
- Dropout: 0.1
- Target modules: all linear layers