# Update README.md
---
language:
- es
- qu

tags:
- quechua
- translation
- spanish

license: apache-2.0

metrics:
- bleu
- sacrebleu

widget:
- text: "Dios ama a los hombres"
- text: "A pesar de todo, soy feliz"
- text: "¿Qué harán allí?"
- text: "Debes aprender a respetar"
---
# t5-small-finetuned-spanish-to-quechua

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) for Spanish-to-Quechua translation.

## Model description

t5-small-finetuned-spanish-to-quechua was trained for 46 epochs on 102,747 sentences; 12,844 sentences were used for validation and 12,843 for the test set.

## Intended uses & limitations

A large part of the dataset has been extracted from biblical texts, which makes the model perform better with certain types of sentences.

### How to use

You can import this model as follows:

```python
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> model_name = 'hackathon-pln-es/t5-small-finetuned-spanish-to-quechua'
>>> model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
```

To translate, you can do the following.
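The snippet below is a minimal sketch that continues from the code above; the input sentence is taken from the widget examples, and the generation settings (for example `max_length`) are illustrative choices rather than the card's original values:

```python
>>> # Sketch: translate one of the widget sentences (settings are illustrative)
>>> sentence = "Dios ama a los hombres"
>>> inputs = tokenizer(sentence, return_tensors="pt")
>>> outputs = model.generate(inputs["input_ids"], max_length=40)
>>> print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```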
### Limitations and bias

Currently, this model can only translate to Quechua of Ayacucho.

## Training data

To train this model, we used the [Spanish to Quechua dataset](https://huggingface.co/datasets/hackathon-pln-es/spanish-to-quechua).
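As a minimal sketch (not part of the original card), the corpus can be loaded for inspection with the `datasets` library; the split and column names are whatever the dataset repository defines:

```python
>>> # Sketch: load the training corpus from the Hub and inspect its structure
>>> from datasets import load_dataset
>>> dataset = load_dataset('hackathon-pln-es/spanish-to-quechua')
>>> print(dataset)  # shows the available splits and column names
```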
## Evaluation results
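As an illustrative sketch (not the card's own evaluation code), the BLEU and sacrebleu metrics declared in the metadata above can be computed with the `evaluate` library; the prediction and reference strings below are placeholders:

```python
>>> # Sketch: placeholder strings stand in for real model outputs and references
>>> import evaluate
>>> sacrebleu = evaluate.load('sacrebleu')
>>> sacrebleu.compute(predictions=['<model translation>'],
...                   references=[['<reference translation>']])
```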