rahimunisab committed · Commit 25ff595 · 1 Parent(s): 162f41c
Update README.md

README.md CHANGED
@@ -11,43 +11,39 @@ pipeline_tag: translation
 ---
 
 
-## Finetuning
 
-
+# Finetuning
 
+This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on the samanantar dataset.
+
+source group: English
+
+target group: Telugu
+
+model: transformer
 
 ## Model description
 
 facebook/mbart-large-50-many-to-many-mmt finetuned for translation task in Telugu language
 
-
 ## Training and evaluation data
 
-ai4bharath/samanantar
+ai4bharath/samanantar
 
-
-## Training hyperparameters
+### Training hyperparameters
 
 The following hyperparameters were used during training:
-
-
-
-total_train_batch_size: 8
-
-num_epochs: 11
+- learning_rate: 2e-5
+- total_train_batch_size: 8
+- num_epochs: 1
 
 ## Benchamark Evaluation
-
-
-
-BLUE score on IN-22: 26.069430553765887
+-BLEU score on Tatoeba: 11.208466750961147
+-BLUE score on IN-22: 26.069430553765887
 
 ## Framework versions
 
-Transformers 4.42.3
-
-
-
-Datasets 2.20.0
-
-Tokenizers 0.19.1
+-Transformers 4.42.3
+-Pytorch 2.1.2
+-Datasets 2.20.0
+-Tokenizers 0.19.1
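Note on the evaluation lines: the README reports BLEU scores on Tatoeba and IN-22 but does not say how they were computed. As a rough illustration of what such a number measures, here is a minimal corpus-level BLEU sketch in plain Python. Whitespace tokenization, a single reference per sentence, and no smoothing are all assumptions made for this sketch. The reported scores were most likely produced with a standard tool and different tokenization, so this code is not expected to reproduce them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions of this sketch: whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each count by its reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu(model_outputs, gold_translations)` on two equal-length lists of strings returns one score for the whole test set, which is the kind of figure the README reports per benchmark.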
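Usage note: the updated README names the base checkpoint, the source language (English), and the target language (Telugu), but gives no inference snippet. The sketch below shows one way to translate with an mBART-50 checkpoint through the `transformers` library. The repository id of the fine-tuned model does not appear in this diff, so `MODEL_ID` defaults to the base checkpoint and should be swapped for the fine-tuned one. The `translate` helper and its `max_new_tokens` default are illustrative choices, not something taken from the README.

```python
SRC_LANG = "en_XX"  # mBART-50 language code for English
TGT_LANG = "te_IN"  # mBART-50 language code for Telugu

# Assumption: the fine-tuned repo id is not stated in the diff, so the base
# checkpoint named in the README is used here as a stand-in.
MODEL_ID = "facebook/mbart-large-50-many-to-many-mmt"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of English sentences into Telugu."""
    # Imported lazily so this module can be loaded without transformers.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **batch,
        # mBART-50 selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

For example, `translate(["How are you today?"])` downloads the checkpoint on first use and returns a list with one Telugu string.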