billingsmoore committed
Commit a89f493
Parent: 85ab26c

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -11,7 +11,7 @@ The model's name is taken from 'machine learning' (ML) and translation (Tibetan:

  The model expects Tibetan transliterated according to THL Simplified Phonetic Transliteration as an input and outputs an English translation.

- The model was evaluated using the BLEU metric, with a final score of 83.4374 on evaluation data.
+ The model was evaluated using the BLEU metric as implemented by [sacreBLEU](https://pypi.org/project/sacrebleu/), with a final score of 83.4374 on evaluation data.
  However, this score is unusually high, and may be the result of testing error. Stricter evaluation
  and training are currently in progress.

@@ -246,7 +246,8 @@ This model was trained for 6 epochs on the dataset described above.

  ## Evaluation

- The evaluation metric for this model was the BLEU score. BLEU (Bilingual Evaluation Understudy) scores measure the quality of
+ The evaluation metric for this model was the BLEU score as implemented by [sacreBLEU](https://pypi.org/project/sacrebleu/).
+ BLEU (Bilingual Evaluation Understudy) scores measure the quality of
  machine-generated translations by comparing them to human-provided reference translations. The score ranges from 0 to 100,
  where 100 represents a perfect match with the reference translations. It evaluates the precision of n-grams (word sequences)
  in the generated text, with higher scores indicating closer alignment to the reference translations. A brevity penalty is applied
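
For readers unfamiliar with the sacreBLEU package this change references, the snippet below is a minimal sketch of how a corpus-level BLEU score is typically computed with it. The hypothesis and reference strings are invented placeholders, and this is not the repository's actual evaluation code.

```python
# Minimal sketch of corpus-level BLEU scoring with sacreBLEU.
# The sentences below are illustrative placeholders, not drawn from
# this model's evaluation data.
import sacrebleu

hypotheses = ["the teacher gave the teachings"]    # model outputs, one string per sentence
references = [["the teacher gave the teachings"]]  # one reference stream covering all hypotheses

result = sacrebleu.corpus_bleu(hypotheses, references)
print(result.score)  # BLEU on the 0-100 scale described above
```

Since a hypothesis identical to its reference scores 100, the reported 83.4374 would imply near-verbatim agreement with the references, consistent with the README's caution that the score is unusually high.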