fra-eng

  • source language name: French

  • target language name: English

  • OPUS readme: README.md

  • model: transformer-align

  • source language code: fr

  • target language code: en

  • dataset: opus

  • release date: 2021-02-22

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights: opus-2021-02-22.zip

  • Training data:

    • fra-eng: Tatoeba-train (180923857)
  • Validation data:

    • eng-fra: Tatoeba-dev, 250098
    • total-size-shuffled: 249757
    • devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
  • Test data:

    • newsdiscussdev2015-enfr.fra-eng: 1500/27759
    • newsdiscusstest2015-enfr.fra-eng: 1500/26995
    • newssyscomb2009.fra-eng: 502/11821
    • news-test2008.fra-eng: 2051/49380
    • newstest2009.fra-eng: 2525/65402
    • newstest2010.fra-eng: 2489/61724
    • newstest2011.fra-eng: 3003/74681
    • newstest2012.fra-eng: 3003/72812
    • newstest2013.fra-eng: 3000/64505
    • newstest2014-fren.fra-eng: 3003/70708
    • Tatoeba-test.fra-eng: 10000/77174
  • test set translations file: test.txt

  • test set scores file: eval.txt

  • BLEU-scores

    Test set                          BLEU
    Tatoeba-test.fra-eng              57.8
    newsdiscusstest2015-enfr.fra-eng  39.7
    newstest2014-fren.fra-eng         38.4
    newsdiscussdev2015-enfr.fra-eng   34.4
    newstest2013.fra-eng              34.0
    newstest2012.fra-eng              33.2
    newstest2011.fra-eng              33.1
    newstest2010.fra-eng              32.7
    newssyscomb2009.fra-eng           31.1
    newstest2009.fra-eng              30.5
    news-test2008.fra-eng             26.5
  • chr-F-scores

    Test set                          chr-F
    Tatoeba-test.fra-eng              0.723
    newstest2014-fren.fra-eng         0.636
    newsdiscusstest2015-enfr.fra-eng  0.621
    newstest2011.fra-eng              0.598
    newstest2010.fra-eng              0.593
    newstest2012.fra-eng              0.593
    newstest2013.fra-eng              0.592
    newsdiscussdev2015-enfr.fra-eng   0.587
    newssyscomb2009.fra-eng           0.575
    newstest2009.fra-eng              0.572
    news-test2008.fra-eng             0.544
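
For readers unfamiliar with the chr-F column, the sketch below illustrates the idea behind the metric: an F-score over character n-grams, reported on a 0 to 1 scale. It is a simplified sentence-level version written for illustration only. The function names, the default n-gram order of 6, and beta = 2 are assumptions taken from the commonly used chrF definition, and this is not necessarily the scorer that produced the numbers above.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str,
         max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF in the range [0, 1]."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    # F-beta: with beta = 2, recall is weighted more heavily than precision.
    return (1 + b2) * p * r / (b2 * p + r)
```

Calling `chrf("the cat sat", "the cat sat down")` gives a value between 0 and 1. Corpus-level tools aggregate n-gram counts over all sentences before computing the F-score, so their results will differ from an average of these per-sentence values.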

System Info:

  • hf_name: fra-eng
  • source_languages: fr
  • target_languages: en
  • opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/README.md
  • original_repo: Tatoeba-Challenge
  • tags: ['translation']
  • languages: ['fr', 'en']
  • src_constituents: ['fra']
  • tgt_constituents: ['eng']
  • src_multilingual: False
  • tgt_multilingual: False
  • helsinki_git_sha: 6faf2dab0b7b01a0e08a114dbacbb7deac54988d
  • transformers_git_sha: e9a6c72b5edfb9561a981959b0e7c62d8ab9ef6c
  • port_machine: 146-193-182-187.edr.inesc.pt
  • port_time: 2023-11-06-16:20
  • weights format: Safetensors
  • model size: 111M params
  • tensor type: FP16
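
A minimal usage sketch, assuming the ported checkpoint loads with the Marian classes from the `transformers` library (the card lists a `transformers_git_sha`, but does not state the repository id). `MODEL_ID` is a placeholder and the helper names `batched` and `translate` are hypothetical; replace the placeholder with the actual Hub id or a local directory containing the weights.

```python
from typing import Iterator, List, Sequence

# Placeholder: the card does not give the repository id of this checkpoint.
MODEL_ID = "path-or-hub-id-of-fra-eng"


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` sentences."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate(sentences: Sequence[str], model_id: str = MODEL_ID,
              batch_size: int = 8) -> List[str]:
    """Translate French sentences to English with a Marian checkpoint."""
    # Imported here so the helper above works without the dependency installed.
    from transformers import MarianMTModel, MarianTokenizer

    # The tokenizer applies the SentencePiece segmentation shipped with the model.
    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    outputs: List[str] = []
    for chunk in batched(sentences, batch_size):
        encoded = tokenizer(chunk, return_tensors="pt",
                            padding=True, truncation=True)
        generated = model.generate(**encoded)
        outputs.extend(tokenizer.batch_decode(generated,
                                              skip_special_tokens=True))
    return outputs
```

With the placeholder filled in, `translate(["Bonjour le monde."])` returns a list containing one English string. Batching keeps memory use bounded when translating many sentences at once.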