sabbas committed · verified · Commit 6460b44 · 1 Parent(s): 2d353d1

Update README.md

Files changed (1): README.md (+37 −3)
README.md CHANGED
@@ -24,17 +24,51 @@ It achieves the following results on the evaluation set:
 
  ## Model description
 
- More information needed
+ - Source: Text (spoken text)
+ - Target: gloss (ArSL gloss)
+ - Domain: ArSL Friday-sermon translation from text to gloss
+ We fine-tuned a pre-trained model (apus_mt) to adapt it to this domain.
 
  ## Intended uses & limitations
 
- More information needed
+ - Data Specificity: The model is trained specifically on Arabic text and ArSL glosses. It may not perform well when applied to other languages or sign languages.
+
+ - Contextual Accuracy: While the model handles straightforward translations effectively, it might struggle with complex sentences or phrases that require a deep understanding of context, especially when sentences are combined or shuffled.
+
+ - Generalization to Unseen Data: The model’s performance may degrade on text that differs significantly in style or content from the training data, such as highly specialized jargon or informal language.
+
+ - Gloss Representation: The model translates text into glosses, a written representation of sign language that does not capture the full complexity of sign-language grammar or non-manual signals (facial expressions, body language).
+
+ - Test Dataset Limitations: The test dataset is a shortened version of a sermon and does not cover all possible sentence structures and contexts, which may limit the model’s ability to generalize to other domains.
+
+ - Ethical Considerations: Care must be taken when deploying this model in real-world applications, as misinterpretations or inaccuracies in translation can lead to misunderstandings, especially in sensitive communications.
 
  ## Training and evaluation data
 
- More information needed
+ - Dataset size before augmentation: 131
+ - Dataset size after augmentation: 8646
+ - Split of the augmented dataset (training and validation):
+   - train: 7349
+   - validation: 1297
+ - For testing, we used a dataset containing the actual Friday-sermon phrases, from which a short Friday sermon was generated.
+
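The augmentation procedure itself is not described on this card; it only reports the jump from 131 to 8646 examples, and the limitations above mention combined and shuffled sentences. One common recipe for growing parallel text/gloss data along those lines is concatenating aligned pairs. The sketch below is purely hypothetical — the function name, pairing strategy, and separator are all assumptions for illustration:

```python
# Hypothetical augmentation sketch: concatenating two aligned (text, gloss)
# pairs yields a new aligned pair. The pairing strategy and the single-space
# separator are illustrative assumptions, not the card's actual procedure.
from itertools import permutations

def augment_by_concatenation(pairs):
    """Return the seed pairs plus every ordered concatenation of two pairs."""
    augmented = list(pairs)
    for (t1, g1), (t2, g2) in permutations(pairs, 2):
        augmented.append((t1 + " " + t2, g1 + " " + g2))
    return augmented

seeds = [("text_a", "GLOSS_A"), ("text_b", "GLOSS_B"), ("text_c", "GLOSS_C")]
out = augment_by_concatenation(seeds)
# 3 seeds + 3*2 ordered concatenations = 9 aligned pairs
```

With 131 seed pairs, ordered concatenation alone yields far more than 8646 candidates, so a real pipeline would presumably sample or filter the combinations.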
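The splitting code is likewise not part of this card, but the counts above correspond to a roughly 85/15 split of the 8646 augmented examples. A minimal sketch that reproduces those sizes (the seed and the exact fraction are assumptions):

```python
# Sketch of an ~85/15 train/validation split; the fraction and seed are
# assumptions chosen to reproduce the reported counts (7349 / 1297).
import random

def split_dataset(examples, train_fraction=0.85, seed=42):
    """Shuffle a list of examples and split it into train/validation lists."""
    rng = random.Random(seed)
    shuffled = examples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

train, validation = split_dataset(list(range(8646)))
# len(train) == 7349, len(validation) == 1297
```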
 
  ## Training procedure
 
+ ### 1. Train and evaluation results
+ - Loss: 0.464023
+ - Word BLEU score: 97.08
+ - Char BLEU score: 98.94
+ - Runtime (seconds): 562.8277
+ - Samples per second: 391.718
+ - Steps per second: 12.26
+
+ ### 2. Test results
+ - Loss: 0.289312
+ - Word BLEU score: 76.92
+ - Char BLEU score: 86.30
+ - Runtime (seconds): 1.1038
+ - Samples per second: 41.67
+ - Steps per second: 0.91
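For reference, word BLEU scores n-gram overlap between predicted and reference gloss token sequences, while char BLEU applies the same formula to character sequences. The scores above were presumably produced by a standard evaluation toolkit; the pure-Python sketch below only illustrates the metric on a single sentence pair, with uniform 4-gram weights and a brevity penalty:

```python
# Illustrative single-sentence BLEU (Papineni et al., 2002): geometric mean
# of modified n-gram precisions times a brevity penalty. Not the toolkit
# used for the scores above -- a minimal sketch for intuition only.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # one empty precision zeroes the geometric mean
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 100; "char BLEU" would be computed as `sentence_bleu(" ".join(hyp), " ".join(ref))` over character tokens.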
 
  ### Training hyperparameters