---
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-id-en
tags:
- generated_from_keras_callback
model-index:
- name: aditnnda/machine_translation_informal2formal
  results: []
---

# aditnnda/machine_translation_informal2formal

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-id-en](https://huggingface.co/Helsinki-NLP/opus-mt-id-en) on the [STIF Indonesia](haryoaw/stif-indonesia) dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.0077
- Validation Loss: 1.2870
- Epoch: 99

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 6000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.4298     | 2.4070          | 0     |
| 2.1508     | 1.8031          | 1     |
| 1.6301     | 1.5249          | 2     |
| 1.3013     | 1.3417          | 3     |
| 1.0752     | 1.2465          | 4     |
| 0.9119     | 1.1651          | 5     |
| 0.7778     | 1.1213          | 6     |
| 0.6763     | 1.0813          | 7     |
| 0.5907     | 1.0542          | 8     |
| 0.5162     | 1.0289          | 9     |
| 0.4573     | 1.0265          | 10    |
| 0.4057     | 1.0115          | 11    |
| 0.3645     | 1.0096          | 12    |
| 0.3227     | 1.0037          | 13    |
| 0.2864     | 1.0016          | 14    |
| 0.2598     | 1.0121          | 15    |
| 0.2291     | 1.0079          | 16    |
| 0.2069     | 1.0199          | 17    |
| 0.1876     | 1.0247          | 18    |
| 0.1717     | 1.0199          | 19    |
| 0.1544     | 1.0283          | 20    |
| 0.1393     | 1.0416          | 21    |
| 0.1285     | 1.0370          | 22    |
| 0.1171     | 1.0430          | 23    |
| 0.1069     | 1.0593          | 24    |
| 0.0990     | 1.0670          | 25    |
| 0.0915     | 1.0655          | 26    |
| 0.0827     | 1.0818          | 27    |
| 0.0781     | 1.0903          | 28    |
| 0.0729     | 1.0998          | 29    |
| 0.0678     | 1.0932          | 30    |
| 0.0639     | 1.1051          | 31    |
| 0.0592     | 1.1125          | 32    |
| 0.0556     | 1.1240          | 33    |
| 0.0509     | 1.1177          | 34    |
| 0.0512     | 1.1355          | 35    |
| 0.0438     | 1.1405          | 36    |
| 0.0453     | 1.1322          | 37    |
| 0.0443     | 1.1419          | 38    |
| 0.0407     | 1.1419          | 39    |
| 0.0397     | 1.1495          | 40    |
| 0.0386     | 1.1609          | 41    |
| 0.0346     | 1.1619          | 42    |
| 0.0351     | 1.1638          | 43    |
| 0.0344     | 1.1711          | 44    |
| 0.0302     | 1.1782          | 45    |
| 0.0470     | 1.1836          | 46    |
| 0.0330     | 1.1913          | 47    |
| 0.0284     | 1.1963          | 48    |
| 0.0268     | 1.1964          | 49    |
| 0.0255     | 1.2017          | 50    |
| 0.0236     | 1.2092          | 51    |
| 0.0241     | 1.2104          | 52    |
| 0.0234     | 1.2170          | 53    |
| 0.0216     | 1.2192          | 54    |
| 0.0209     | 1.2317          | 55    |
| 0.0205     | 1.2289          | 56    |
| 0.0193     | 1.2363          | 57    |
| 0.0191     | 1.2295          | 58    |
| 0.0184     | 1.2306          | 59    |
| 0.0185     | 1.2352          | 60    |
| 0.0184     | 1.2415          | 61    |
| 0.0174     | 1.2389          | 62    |
| 0.0166     | 1.2392          | 63    |
| 0.0167     | 1.2469          | 64    |
| 0.0166     | 1.2457          | 65    |
| 0.0147     | 1.2456          | 66    |
| 0.0146     | 1.2511          | 67    |
| 0.0147     | 1.2552          | 68    |
| 0.0147     | 1.2493          | 69    |
| 0.0133     | 1.2532          | 70    |
| 0.0135     | 1.2561          | 71    |
| 0.0136     | 1.2609          | 72    |
| 0.0130     | 1.2602          | 73    |
| 0.0119     | 1.2629          | 74    |
| 0.0123     | 1.2667          | 75    |
| 0.0114     | 1.2675          | 76    |
| 0.0122     | 1.2673          | 77    |
| 0.0111     | 1.2649          | 78    |
| 0.0099     | 1.2722          | 79    |
| 0.0109     | 1.2693          | 80    |
| 0.0101     | 1.2727          | 81    |
| 0.0101     | 1.2746          | 82    |
| 0.0096     | 1.2739          | 83    |
| 0.0103     | 1.2734          | 84    |
| 0.0096     | 1.2805          | 85    |
| 0.0093     | 1.2799          | 86    |
| 0.0097     | 1.2823          | 87    |
| 0.0093     | 1.2826          | 88    |
| 0.0095     | 1.2808          | 89    |
| 0.0091     | 1.2875          | 90    |
| 0.0081     | 1.2849          | 91    |
| 0.0084     | 1.2849          | 92    |
| 0.0083     | 1.2838          | 93    |
| 0.0089     | 1.2866          | 94    |
| 0.0084     | 1.2851          | 95    |
| 0.0082     | 1.2870          | 96    |
| 0.0078     | 1.2871          | 97    |
| 0.0078     | 1.2872          | 98    |
| 0.0077     | 1.2870          | 99    |

### Framework versions

- Transformers 4.35.2
- TensorFlow 2.14.0
- Datasets 2.15.0
- Tokenizers 0.15.0
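
The optimizer configuration listed under the training hyperparameters uses a `PolynomialDecay` learning-rate schedule with `power: 1.0` and `cycle: False`, which amounts to a linear ramp from `5e-05` down to `0.0` over 6000 optimizer steps. The sketch below reproduces that formula in plain Python to make the schedule easier to read. The function name `polynomial_decay` is hypothetical and not part of this model's training code. The defaults are copied from the configuration, and the clamping of steps beyond `decay_steps` is an assumption about how `cycle: False` behaves.

```python
def polynomial_decay(step, initial_lr=5e-05, end_lr=0.0,
                     decay_steps=6000, power=1.0):
    """Illustrative re-implementation of a non-cycling polynomial decay.

    Defaults mirror the values in the optimizer config; the function
    itself is a hypothetical helper, not the training code.
    """
    # Assumption: with cycle=False the schedule holds at end_lr
    # once step exceeds decay_steps, so the step is clamped.
    step = min(step, decay_steps)
    fraction = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction ** power + end_lr


# Print the learning rate at a few optimizer steps.
for step in (0, 1500, 3000, 6000, 9000):
    print(step, polynomial_decay(step))
```

With `power=1.0` the learning rate at step 3000 is half the initial value, and any step at or beyond 6000 yields `0.0`.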