---
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-ar
tags:
- translation
- generated_from_trainer
model-index:
- name: text2gloss_ar
  results: []
library_name: transformers
pipeline_tag: translation
---

# text2gloss_ar

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ar](https://huggingface.co/Helsinki-NLP/opus-mt-en-ar). The training metadata does not name the dataset; it is described under "Training and evaluation data".
It achieves the following results on the evaluation set:
- Loss: 0.0306
- Word BLEU: 97.0831
- Char BLEU: 98.9391

## Model description

- Source: text (spoken text)
- Target: gloss (Arabic Sign Language, ArSL, gloss)
- Domain: translation of ArSL Friday sermons from text to gloss

We fine-tuned a pre-trained OPUS-MT model (opus-mt-en-ar) to specialize it for this domain.

## Intended uses & limitations

- Data Specificity: The model is trained specifically on Arabic text and ArSL glosses. It may not perform well when applied to other languages or sign languages.
- Contextual Accuracy: The model handles straightforward translations effectively, but it may struggle with complex sentences or phrases that require a deep understanding of context, especially when sentences are combined or shuffled.
- Generalization to Unseen Data: Performance may degrade on text that differs significantly in style or content from the training data, such as highly specialized jargon or informal language.
- Gloss Representation: The model translates text into glosses, which are a written representation of sign language. Glosses do not capture the full complexity of sign language grammar or non-manual signals (facial expressions, body language).
- Test Dataset Limitations: The test dataset is a shortened version of a sermon and does not cover all possible sentence structures and contexts, so the test results may not indicate how well the model generalizes to other domains.
- Ethical Considerations: Care must be taken when deploying this model in real-world applications, because misinterpretations or inaccuracies in translation can lead to misunderstandings, especially in sensitive communications.

## Training and evaluation data

- Dataset size before augmentation: 131
- Dataset size after augmentation: 8646
- For training and validation, the augmented dataset was split as follows:
  - train: 7349
  - validation: 1297
- For testing, we used a dataset of phrases from an actual Friday sermon scenario, assembled into a short Friday sermon.

## Training procedure

### 1- Train and Evaluation Results

- Train and Evaluation Loss: 0.464023
- Train and Evaluation Word BLEU Score: 97.08
- Train and Evaluation Char BLEU Score: 98.94
- Train and Evaluation Runtime (seconds): 562.8277
- Train and Evaluation Samples per Second: 391.718
- Train and Evaluation Steps per Second: 12.26

### 2- Test Results

- Test Loss: 0.289312
- Test Word BLEU Score: 76.92
- Test Char BLEU Score: 86.30
- Test Runtime (seconds): 1.1038
- Test Samples per Second: 41.67
- Test Steps per Second: 0.91

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Word BLEU | Char BLEU |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:---------:|
| 2.726         | 1.0   | 230  | 0.8206          | 24.8561   | 42.0470   |
| 0.6983        | 2.0   | 460  | 0.3166          | 61.8643   | 74.7375   |
| 0.3167        | 3.0   | 690  | 0.1288          | 85.4787   | 92.1539   |
| 0.1599        | 4.0   | 920  | 0.0699          | 92.9287   | 97.2020   |
| 0.0971        | 5.0   | 1150 | 0.0504          | 94.6364   | 97.6967   |
| 0.0626        | 6.0   | 1380 | 0.0383          | 96.3441   | 98.6000   |
| 0.0507        | 7.0   | 1610 | 0.0396          | 95.9440   | 98.5028   |
| 0.036         | 8.0   | 1840 | 0.0364          | 96.0036   | 98.3957   |
| 0.0289        | 9.0   | 2070 | 0.0306          | 97.0831   | 98.9391   |

### Framework versions

- Transformers 4.42.4
- Pytorch 1.12.0+cu102
- Datasets 2.21.0
- Tokenizers 0.19.1
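
### Cross-checking the reported numbers

The dataset sizes, batch size, and step counts reported in this card can be checked against each other with a little arithmetic. The sketch below is not part of the original training code. It assumes a single training device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card. The validation share and augmentation factor it prints are inferred from the reported sizes, not documented settings.

```python
import math

# Figures copied from this model card.
ORIGINAL_SIZE = 131          # dataset size before augmentation
TOTAL_AUGMENTED = 8646       # dataset size after augmentation
TRAIN_SIZE = 7349
VALIDATION_SIZE = 1297
TRAIN_BATCH_SIZE = 32
REPORTED_STEPS_PER_EPOCH = 230   # step count at epoch 1.0 in the results table
REPORTED_FINAL_STEP = 2070       # step count at epoch 9.0 in the results table

# The two splits should account for every augmented example.
assert TRAIN_SIZE + VALIDATION_SIZE == TOTAL_AUGMENTED

# Inferred quantities (not stated in the card).
augmentation_factor = TOTAL_AUGMENTED / ORIGINAL_SIZE
validation_fraction = VALIDATION_SIZE / TOTAL_AUGMENTED

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SIZE / TRAIN_BATCH_SIZE)

print(f"augmentation factor: {augmentation_factor:.1f}")
print(f"validation fraction: {validation_fraction:.4f}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"steps after 9 epochs: {9 * steps_per_epoch}")
```

Under these assumptions the computed steps per epoch and the step count after nine epochs agree with the training results table, and the validation set comes out at roughly 15% of the augmented data.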