Safetensors · llama
jisx committed on
Commit
b87ec2a
1 Parent(s): 6e295ea

Update README.md

Files changed (1)
  1. README.md +16 -9
README.md CHANGED
@@ -1,11 +1,11 @@
- ---
- license: llama2
- datasets:
- - MaLA-LM/PolyWrite
- - Davlan/sib200
- base_model:
- - meta-llama/Llama-2-7b-hf
- ---
+ ---
+ license: llama2
+ datasets:
+ - MaLA-LM/PolyWrite
+ - Davlan/sib200
+ base_model:
+ - meta-llama/Llama-2-7b-hf
+ ---

  # EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models

@@ -61,4 +61,11 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))

  Challenges remain in low-resource languages, where the model tends to have higher **Self-BLEU** scores, indicating reduced output diversity.

- ---
+ ---
+
+
+ ## Acknowledgements
+
+ We extend our thanks to the language communities and contributors who helped source, clean, and validate the diverse data used in the MaLA Corpus. Their efforts are invaluable in supporting linguistic diversity in AI research.
+
+ This work is created by researchers at [Helsinki-NLP](https://huggingface.co/Helsinki-NLP) in collaboration with partners from TU Darmstadt, the University of Edinburgh, and LMU Munich. It is funded by [HPLT](https://hplt-project.org) and [UTTER](https://he-utter.eu).
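For context, the second hunk's header shows the tail of the README's generation example (`print(tokenizer.decode(outputs[0], skip_special_tokens=True))`). A minimal sketch of what such a snippet typically looks like with Transformers; the repo id below is an assumption for illustration, since this commit does not name one:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaLA-LM/emma-500-llama2-7b"  # hypothetical repo id, not confirmed by this commit
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; assumes a GPU is available
    device_map="auto",
)

# Generate a short continuation and decode it, ending with the same
# decode call that appears in the diff's hunk context.
inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```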
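The updated README also cites **Self-BLEU** as evidence of reduced output diversity in low-resource languages. A minimal sketch of how Self-BLEU is commonly computed (each generation scored against all the others; higher means less diverse), using `sacrebleu`; the helper name `self_bleu` is ours, not from the model card:

```python
from sacrebleu import sentence_bleu

def self_bleu(samples: list[str]) -> float:
    """Average BLEU of each generation against all the others.

    Higher values mean the generations resemble one another more,
    i.e. lower output diversity.
    """
    scores = []
    for i, hyp in enumerate(samples):
        refs = samples[:i] + samples[i + 1:]  # all other generations as references
        scores.append(sentence_bleu(hyp, refs).score)
    return sum(scores) / len(scores)

# Two near-duplicates push the score up; a distinct third output pulls it down.
print(self_bleu([
    "The cat sat on the mat.",
    "The cat sat on a mat.",
    "A dog ran through the park.",
]))
```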