billingsmoore committed
Commit a89f493 · 1 Parent(s): 85ab26c
Update README.md
README.md CHANGED

@@ -11,7 +11,7 @@ The model's name is taken from 'machine learning' (ML) and translation (Tibetan:
 
 The model expects Tibetan transliterated according to THL Simplified Phonetic Transliteration as an input and outputs an English translation.
 
-The model was evaluated using the BLEU metric, with a final score of 83.4374 on evaluation data.
+The model was evaluated using the BLEU metric as implemented by [sacreBLEU](https://pypi.org/project/sacrebleu/), with a final score of 83.4374 on evaluation data.
 However, this score is unusually high, and may be the result of testing error. Stricter evaluation
 and training are currently in progress.
 
@@ -246,7 +246,8 @@ This model was trained for 6 epochs on the dataset described above.
 
 ## Evaluation
 
-The evaluation metric for this model was the BLEU score
+The evaluation metric for this model was the BLEU score as implemented by [sacreBLEU](https://pypi.org/project/sacrebleu/).
+BLEU (Bilingual Evaluation Understudy) scores measure the quality of
 machine-generated translations by comparing them to human-provided reference translations. The score ranges from 0 to 100,
 where 100 represents a perfect match with the reference translations. It evaluates the precision of n-grams (word sequences)
 in the generated text, with higher scores indicating closer alignment to the reference translations. A brevity penalty is applied
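For context on the metric this change points to, the snippet below is a minimal sketch of computing a corpus-level BLEU score with the sacrebleu Python package linked in the diff. The hypothesis and reference sentences are invented placeholders, not the model's actual evaluation data, and the code is not taken from this repository.

```python
import sacrebleu

# Placeholder system outputs and references; the real evaluation data
# for this model is not shown in this commit.
hypotheses = [
    "May all beings be free from suffering.",
    "This is the root text of the instruction.",
]
references = [
    "May all beings be free of suffering.",
    "This is the root text of the instructions.",
]

# sacreBLEU expects one list of reference strings per reference set,
# so a single reference translation per sentence is wrapped in a list.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])

# Scores are reported on the 0-100 scale used in the README,
# e.g. the 83.4374 quoted above.
print(f"BLEU: {bleu.score:.4f}")
```

The brevity penalty mentioned at the end of the evaluation text lowers the score when the system output is shorter overall than the references, so short but locally precise outputs cannot inflate the n-gram precision on their own.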