Commit 7711cbc by amurienne (verified) · 1 parent: 9d2905a

Update README.md

  1. README.md +41 -3
README.md CHANGED
@@ -1,3 +1,41 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - Bretagne/ofis_publik_br-fr
+ - Bretagne/OpenSubtitles_br_fr
+ - Bretagne/Autogramm_Breton_translation
+ language:
+ - fr
+ - br
+ base_model:
+ - facebook/m2m100_418M
+ pipeline_tag: translation
+ library_name: transformers
+ ---
+
+ # Kellag
+
+ * **Kellag** is a Breton -> French translation model.
+ * Kellag is the temporary "brother" model of [Gallek](https://huggingface.co/amurienne/gallek-m2m100), since a bidirectional fr <-> br model is not ready yet (work in progress).
+ * The current model version reached a **BLEU score of 50** after 10 epochs on a 20% split of the training set.
+ * It is fine-tuned in one direction only (br -> fr) for now.
+ * Training details are available in the [GweLLM GitHub repository](https://github.com/blackccpie/GweLLM).
+
+ Sample test code:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
+
+ modelcard = "amurienne/kellag-m2m100"
+
+ # Load the fine-tuned model and its tokenizer from the Hub
+ model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
+ tokenizer = AutoTokenizer.from_pretrained(modelcard)
+
+ # Build a Breton -> French translation pipeline running on CPU
+ translation_pipeline = pipeline(
+     "translation", model=model, tokenizer=tokenizer,
+     src_lang="br", tgt_lang="fr", max_length=512, device="cpu",
+ )
+
+ # The input starts with an instruction prefix ("translate from Breton to French:")
+ breton_text = "treiñ eus ar brezhoneg d'ar galleg: deskiñ a ran brezhoneg er skol."
+
+ result = translation_pipeline(breton_text)
+ print(result[0]['translation_text'])
+ ```
+
+ A demo is available on the [Gallek Space](https://huggingface.co/spaces/amurienne/Gallek).
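
The sample code in the README translates a single sentence. As a minimal sketch of translating several sentences at once, the helpers below prepare a batch and collect the results. The names `PREFIX`, `build_inputs`, and `translate_all` are hypothetical and not part of the model card. The prefix is copied from the sample input; whether the model requires it on every input is an assumption. `translate_all` expects a pipeline built as in the sample code, which returns one `translation_text` entry per input string.

```python
from typing import Callable, Iterable, List

# Instruction prefix copied from the README's sample input.
# Assumption: the model expects it at the start of every input sentence.
PREFIX = "treiñ eus ar brezhoneg d'ar galleg: "


def build_inputs(sentences: Iterable[str], prefix: str = PREFIX) -> List[str]:
    """Strip whitespace, drop empty entries, and prepend the prefix."""
    cleaned = (s.strip() for s in sentences)
    return [prefix + s for s in cleaned if s]


def translate_all(sentences: Iterable[str], translation_pipeline: Callable) -> List[str]:
    """Translate a batch of Breton sentences.

    `translation_pipeline` is assumed to behave like the pipeline built in
    the README sample: called with a list of strings, it returns a list of
    dicts that each hold a 'translation_text' key.
    """
    inputs = build_inputs(sentences)
    if not inputs:
        return []
    results = translation_pipeline(inputs)
    return [r["translation_text"] for r in results]
```

With the `translation_pipeline` object from the sample code, `translate_all(["deskiñ a ran brezhoneg er skol."], translation_pipeline)` would return a list with one French string per non-empty input.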