---
language:
- es
- qu
tags:
- quechua
- translation
- spanish
license: apache-2.0
---

# t5-small-finetuned-spanish-to-quechua

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) for Spanish-to-Quechua translation.

## Model description

## Intended uses & limitations

### How to use

You can load the model and its tokenizer as follows:

```python
>>> from transformers import AutoModelForSeq2SeqLM
>>> from transformers import AutoTokenizer
>>> model_name = 'hackathon-pln-es/t5-small-finetuned-spanish-to-quechua'
>>> model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
```

To translate a sentence, tokenize it, generate with beam search, and decode the output:

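```python
>>> sentence = "Entonces dijo"
>>> # Tokenize the Spanish sentence into PyTorch tensors
>>> inputs = tokenizer(sentence, return_tensors="pt")
>>> # Generate the Quechua translation with beam search
>>> output = model.generate(inputs["input_ids"], max_length=40, num_beams=4, early_stopping=True)
>>> # Decode the best beam, dropping special tokens such as padding and end-of-sequence
>>> print('Original Sentence: {} \nTranslated sentence: {}'.format(sentence, tokenizer.decode(output[0], skip_special_tokens=True)))
```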
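
For more than one sentence, the same tokenize, generate, and decode steps can be wrapped in a small helper. The sketch below is only an illustration: the function name `translate` and its parameters are hypothetical and not part of this model's repository, and it assumes `model` and `tokenizer` are the objects loaded above.

```python
def translate(sentences, model, tokenizer, max_length=40, num_beams=4):
    """Translate a list of Spanish sentences in one batch.

    `translate` is a hypothetical helper written for this example; it only
    relies on the tokenizer call, `generate`, and `batch_decode`.
    """
    # Pad to the longest sentence so the batch forms a rectangular tensor.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    # Pass the attention mask so padded positions are ignored during generation.
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
    )
    # Decode every generated sequence and drop special tokens.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the objects from the import snippet, `translate(["Entonces dijo"], model, tokenizer)` returns a list containing one translated string.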

### Limitations and bias

## Training data

## Evaluation results

We obtained the following evaluation metrics during training:

`eval_bleu = 2.9691`

`eval_loss = 1.2064628601074219`
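
As a reading aid, the reported loss can be converted to a perplexity. This is a sketch under an assumption the card does not state: that `eval_loss` is the mean per-token cross-entropy in nats, as is typical for sequence-to-sequence fine-tuning.

```python
import math

# Value copied from the reported evaluation metrics.
eval_loss = 1.2064628601074219

# Assumption: eval_loss is a mean per-token cross-entropy in nats,
# so exponentiating it gives a per-token perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 3.34.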