[ss-en] Siswati to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus

This model was created from Siswati-English aligned sentences in the South African Gov-ZA multilingual corpus.

The dataset contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). The data was scraped from the government's website: https://www.gov.za/cabinet-statements
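
The card does not include a usage snippet, so the following is a minimal sketch of how an M2M100-based checkpoint is typically called through the Hugging Face `transformers` library. It assumes this checkpoint keeps the standard M2M100 tokenizer and its language codes (`ss` for Siswati, `en` for English). The helper names `load_model` and `translate` and the repository ID in the commented usage are hypothetical placeholders, not names taken from this card.

```python
from typing import List


def load_model(model_id: str):
    """Load an M2M100 checkpoint and its tokenizer (hypothetical helper)."""
    # Imported lazily so translate() can be reused without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)
    return model, tokenizer


def translate(
    texts: List[str],
    model,
    tokenizer,
    src_lang: str = "ss",  # assumption: standard M2M100 code for Siswati
    tgt_lang: str = "en",  # assumption: standard M2M100 code for English
) -> List[str]:
    """Translate a batch of sentences from src_lang to tgt_lang."""
    # M2M100 marks the source language on the tokenizer before encoding.
    tokenizer.src_lang = src_lang
    encoded = tokenizer(texts, return_tensors="pt", padding=True)
    # The target language is selected by forcing its language token
    # as the first generated token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example usage (the repository ID below is a placeholder, not the real one):
# model, tokenizer = load_model("your-namespace/your-ss-en-checkpoint")
# print(translate(["<Siswati sentence here>"], model, tokenizer))
```

Keeping loading and translation in separate functions means the model is loaded once and reused across batches. If the checkpoint uses different language codes, pass them through `src_lang` and `tgt_lang`.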

Authors

  • Vukosi Marivate - @vukosi
  • Matimba Shingange
  • Richard Lastrucci
  • Isheanesu Joseph Dzingirai
  • Jenalea Rajab

BibTeX entry and citation info

@inproceedings{lastrucci-etal-2023-preparing,
    title = "Preparing the Vuk{'}uzenzele and {ZA}-gov-multilingual {S}outh {A}frican multilingual corpora",
    author = "Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate",
    booktitle = "Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.rail-1.3",
    pages = "18--25"
}

Paper - Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora

Model size: 486M params (Safetensors, tensor type F32)