This model is a fine-tuned version of multilingual BART (mBART, 610M parameters) for German grammatical error correction. It was trained on the public Falko-MERLIN dataset.
To initialize the model:
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
model = MBartForConditionalGeneration.from_pretrained("MRNH/mbart-german-grammar-corrector")
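To load the tokenizer and encode a source sentence together with its corrected target (text_target fills the "labels" field used for training):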
tokenizer = MBart50TokenizerFast.from_pretrained("MRNH/mbart-german-grammar-corrector", src_lang="de_DE", tgt_lang="de_DE")
input = tokenizer("I was here yesterday to studying", text_target="I was here yesterday to study", return_tensors="pt")
To generate text using the model:
output = model.generate(input["input_ids"], attention_mask=input["attention_mask"], forced_bos_token_id=tokenizer.lang_code_to_id["de_DE"])
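The generate call returns token ids rather than text, so a decoding step is still needed. The following is a minimal sketch of a wrapper that tokenizes a list of sentences, generates, and decodes. The function name correct_sentences and the max_new_tokens default of 64 are illustrative assumptions, not part of the original model card.

```python
def correct_sentences(model, tokenizer, sentences, max_new_tokens=64):
    """Return one decoded string per input sentence.

    `model` and `tokenizer` are expected to be the objects loaded above;
    `max_new_tokens=64` is an assumed default, not a value from the model card.
    """
    # Pad so that sentences of different lengths fit in one batch.
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(
        batch["input_ids"],
        attention_mask=batch["attention_mask"],
        # Force German as the first generated token, as in the call above.
        forced_bos_token_id=tokenizer.lang_code_to_id["de_DE"],
        max_new_tokens=max_new_tokens,
    )
    # Drop language-code and padding tokens from the decoded text.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With the model and tokenizer loaded as shown earlier, correct_sentences(model, tokenizer, ["..."]) would return a list with one string per input.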
During training, the loss is computed from the model output h. When labels are passed, the model returns the cross-entropy loss together with the logits:
h = model(input_ids=input["input_ids"],
          attention_mask=input["attention_mask"],
          labels=input["labels"])
loss, logits = h.loss, h.logits
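To show how this loss would typically be used, here is a minimal sketch of a single optimization step. The model card does not state the optimizer, learning rate, or batching used for fine-tuning, so the train_step helper and the choice of optimizer are assumptions for illustration only.

```python
def train_step(model, optimizer, batch):
    """Run one gradient update and return the scalar loss.

    `batch` is a tokenizer output containing "input_ids", "attention_mask"
    and "labels". `optimizer` is any PyTorch-style optimizer; which one the
    author used is not documented, so this is an assumption.
    """
    model.train()            # enable dropout and other training-mode layers
    optimizer.zero_grad()    # clear gradients from the previous step
    h = model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        labels=batch["labels"],
    )
    h.loss.backward()        # backpropagate the cross-entropy loss
    optimizer.step()         # apply the parameter update
    return h.loss.item()
```

For example, with an optimizer such as torch.optim.AdamW(model.parameters(), lr=...) and the encoded pair from above, train_step(model, optimizer, input) performs one update; the learning rate is left open because the source does not give one.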