---
license: openrail
datasets:
- WelfCrozzo/kupalinka
language:
- be
- en
- ru
metrics:
- bleu
library_name: transformers
tags:
- translation
widget:
- text: <extra_id_1>да зорак праз цяжкасці
example_title: be -> ru
- text: <extra_id_2>да зорак праз цяжкасці
example_title: be -> en
- text: <extra_id_3>к звездам через трудности
example_title: ru -> be
- text: <extra_id_5>к звездам через трудности
example_title: ru -> en
- text: <extra_id_6>to the stars through difficulties.
example_title: en -> be
- text: <extra_id_7>to the stars through difficulties.
example_title: en -> ru
---

# T5 for the Belarusian language
This model is based on T5-small with a sequence length of 128 tokens. It was trained from scratch on a single RTX 3090 (24 GB).
## Supported tasks

The task is selected by prepending one of the following prefix tokens to the input text (see the sketch after this list):

- translation BE to RU: `<extra_id_1>`
- translation BE to EN: `<extra_id_2>`
- translation RU to BE: `<extra_id_3>`
- translation RU to EN: `<extra_id_5>`
- translation EN to BE: `<extra_id_6>`
- translation EN to RU: `<extra_id_7>`
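As a minimal sketch of how these prefixes are applied, the direction-to-prefix mapping below is assumed for illustration only; the dictionary keys and the `with_prefix` helper are hypothetical names, while the `<extra_id_N>` tokens come from the list above.

```python
# Hypothetical mapping of translation directions to the prefix tokens listed above.
TASK_PREFIXES = {
    "be-ru": "<extra_id_1>",
    "be-en": "<extra_id_2>",
    "ru-be": "<extra_id_3>",
    "ru-en": "<extra_id_5>",
    "en-be": "<extra_id_6>",
    "en-ru": "<extra_id_7>",
}

def with_prefix(text: str, direction: str) -> str:
    """Prepend the task prefix for the given translation direction."""
    return TASK_PREFIXES[direction] + text
```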
## Metrics
## How to Get Started with the Model
```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("WelfCrozzo/T5-L128-belarusian")
model = T5ForConditionalGeneration.from_pretrained("WelfCrozzo/T5-L128-belarusian")

# Prepend the task prefix (<extra_id_1> = BE -> RU) and tokenize the input
x = tokenizer.encode('<extra_id_1>да зорак праз цяжкасці', return_tensors='pt')

# Generate up to 128 tokens and decode the result
result = model.generate(x, return_dict_in_generate=True, output_scores=True, max_length=128)
print(tokenizer.decode(result["sequences"][0]))
```
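Building on the snippet above (and reusing its `tokenizer` and `model` objects), a small convenience wrapper could switch between the supported directions by changing the prefix; the `translate` helper below is an illustrative sketch, not part of the model or of `transformers`.

```python
# Illustrative helper, assuming the tokenizer and model loaded above.
def translate(text: str, prefix: str, max_length: int = 128) -> str:
    input_ids = tokenizer.encode(prefix + text, return_tensors='pt')
    output_ids = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(translate('да зорак праз цяжкасці', '<extra_id_2>'))              # BE -> EN
print(translate('to the stars through difficulties.', '<extra_id_7>'))  # EN -> RU
```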