
Ekogenie
Dataset Source:
- Original Source: The English sentences were sourced from public domain sources such as https://www.gutenberg.org/.
- Translation Tool: Google Translate was used for translating the sentences from English to Yoruba.
Dataset Format:
- english: The original English sentence.
- yoruba: The Yoruba translation of the sentence.
- source: The URL of the text the English sentence was taken from.
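The three fields above can be modelled as a small typed record. The following is a minimal sketch, assuming each entry arrives as a dictionary keyed by the field names listed above; `SentencePair` and `parse_record` are hypothetical names, not part of the dataset. If the files use the shorter column names shown in the example table (`en`, `yo`), adjust `REQUIRED_FIELDS` accordingly.

```python
from dataclasses import dataclass

# Field names taken from the "Dataset Format" list; the on-disk
# column names may differ, so treat these as an assumption.
REQUIRED_FIELDS = ("english", "yoruba", "source")


@dataclass(frozen=True)
class SentencePair:
    """One dataset entry: an English sentence, its Yoruba translation, and its source."""
    english: str
    yoruba: str
    source: str


def parse_record(record: dict) -> SentencePair:
    """Validate one raw record and convert it to a SentencePair.

    Raises ValueError if a field is missing, not a string, or blank.
    """
    values = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"record has no usable '{field}' field")
        values.append(value.strip())
    return SentencePair(*values)
```

Rejecting blank fields at load time keeps empty translations out of later training or analysis steps.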
Example:
en | yo | source |
---|---|---|
The subconscious offensiveness of their attitude has constituted old Jolyon's 'home' the psychological moment of the family history, made it the prelude of their drama. | Iwa ibinu èrońgbà ti iṣesi wọn ti jẹ “ile” atijọ ti Jolyon ni akoko imọ-jinlẹ ti itan-akọọlẹ ẹbi, jẹ ki o jẹ iṣaaju ti eré wọn. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |
The Forsytes were resentful of something, not individually, but as a family; this resentment expressed itself in an added perfection of raiment, an exuberance of family cordiality, an exaggeration of family importance, and--the sniff. | Awọn Forsytes binu si nkan kan, kii ṣe olukuluku, ṣugbọn gẹgẹbi idile; ibinu yii ṣe afihan ararẹ ni pipe ti aṣọ ti a fi kun, igbadun ti ifarabalẹ idile, iṣaju ti pataki idile, ati --ifun. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |
Danger--so indispensable in bringing out the fundamental quality of any society, group, or individual--was what the Forsytes scented; the premonition of danger put a burnish on their armour. | Ewu - nitorinaa ko ṣe pataki lati mu didara ipilẹ ti awujọ, ẹgbẹ, tabi ẹni kọọkan jade - jẹ ohun ti awọn Forsytes rùn; premonition ti ewu fi kan iná lori wọn ihamọra. | https://www.gutenberg.org/ebooks/2559.txt.utf-8 |
Dataset Size:
- Number of Entries: 520,000
Usage:
This dataset can be used for:
- Training machine translation models for Yoruba.
- Analyzing translation quality and limitations in automated tools.
- Supporting linguistic research and NLP projects for low-resource languages.
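For the machine translation use case, the entries usually need to be divided into training, validation, and test sets. The card does not define official splits, so the following is only a sketch of one common approach: assign each pair to a split by hashing its English sentence, which keeps the assignment reproducible and places duplicate sentences in the same split. The function names and the 1% validation and test shares are assumptions chosen for illustration.

```python
import hashlib


def assign_split(english: str, valid_pct: int = 1, test_pct: int = 1) -> str:
    """Map a sentence to 'train', 'validation', or 'test' deterministically."""
    digest = hashlib.sha256(english.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a stable bucket number in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + valid_pct:
        return "validation"
    return "train"


def split_pairs(pairs, valid_pct: int = 1, test_pct: int = 1) -> dict:
    """Group records (dicts with an 'english' key) by their assigned split."""
    splits = {"train": [], "validation": [], "test": []}
    for pair in pairs:
        name = assign_split(pair["english"], valid_pct, test_pct)
        splits[name].append(pair)
    return splits
```

Because many sentences come from the same book, splitting on the `source` field instead may give a stricter evaluation; the same hashing idea applies with `pair["source"]` as the key.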
Limitations and Considerations:
- Quality of Translations: Because the translations were generated with Google Translate, some sentences may be inaccurate. Manual validation is recommended for critical applications.
- Cultural and Contextual Nuances: Machine translations might miss idiomatic expressions or cultural nuances present in the source language.
- Biases: Any biases inherent in Google Translate's model may propagate into this dataset.
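Before manual validation, simple heuristics can surface the most suspicious pairs. The sketch below is an illustration rather than a validated quality filter: it flags empty translations, translations identical to the English source, and pairs whose character-length ratio falls outside a range. The function names and the 0.5 to 2.0 ratio bounds are assumptions and should be tuned on a reviewed sample. A seeded random sample is also drawn for human review.

```python
import random


def flag_pair(english: str, yoruba: str,
              min_ratio: float = 0.5, max_ratio: float = 2.0) -> list:
    """Return a list of heuristic warning flags for one sentence pair."""
    en, yo = english.strip(), yoruba.strip()
    if not yo:
        return ["empty_translation"]
    flags = []
    # An output equal to the input usually means nothing was translated.
    if en.lower() == yo.lower():
        flags.append("untranslated")
    # Very short or very long translations relative to the source are suspect.
    ratio = len(yo) / max(len(en), 1)
    if ratio < min_ratio or ratio > max_ratio:
        flags.append("length_ratio")
    return flags


def sample_for_review(pairs, k: int, seed: int = 0) -> list:
    """Draw up to k pairs reproducibly for manual validation."""
    items = list(pairs)
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))
```

These checks catch only gross failures. They cannot detect fluent but wrong translations, missed idioms, or partially untranslated sentences, which is why the human review sample remains necessary.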
Licensing:
Source Material License: Public Domain
Tags:
machine-translation
speech-to-text
yoruba-language
african-languages
Task_categories:
text-classification
machine-translation