---
language:
- bg
- cs
- zh
- de
- fi
- fr
- ru
- es
tags:
- generation
- question answering
- instruction tuning
license: cc-by-nc-4.0
---

### Model Description

This Hugging Face repository contains LLMs that were instruction-tuned with full-parameter fine-tuning and then used to study whether monolingual or multilingual instruction tuning is more favourable.
* [GitHub](https://github.com/hplt-project/monolingual-multilingual-instruction-tuning/tree/main)
* [Paper](https://arxiv.org/abs/2309.08958)

#### Instruction tuning details
* Base model: [bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b)
* Instruction tuning language: multilingual downsampled (Bulgarian, Czech, Chinese, German, Finnish, French, Russian, and Spanish)
* Training method: full-parameter fine-tuning.
* Best checkpoint: trained for 3 epochs; the checkpoint with the best (lowest) cross-entropy on a validation set was selected.
* Dataset: machine-translated from [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned). You can download our data [HERE](https://github.com/hplt-project/monolingual-multilingual-instruction-tuning/tree/main/training-data).

#### Usage
The model checkpoint should be loaded using the `transformers` library.

Please refer to our GitHub repository [HERE](https://github.com/hplt-project/monolingual-multilingual-instruction-tuning/tree/main/fpft) for inference and training instructions.
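
As a rough illustration of loading the checkpoint with `transformers`, the sketch below may help. It rests on assumptions that are not confirmed by this card: `MODEL_ID` is a placeholder to be replaced with this repository's actual id, and `build_prompt` uses the standard Alpaca prompt template, which is only a guess at the format used during training. The GitHub repository remains the authoritative source for the inference setup.

```python
from typing import Optional

# Placeholder, NOT a real id: replace with this repository's id on the Hub.
MODEL_ID = "<this-repository-id>"


def build_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Build an Alpaca-style prompt.

    Assumption: the standard Alpaca template is used here for illustration;
    check the GitHub repository for the template used in training.
    """
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


def generate(model_id: str, instruction: str,
             input_text: Optional[str] = None,
             max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a response for one instruction."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    prompt = build_prompt(instruction, input_text)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the tokens produced after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Once `MODEL_ID` is set, a call such as `generate(MODEL_ID, "Explique la photosynthèse en deux phrases.")` would download the weights and return the generated continuation. Generation settings other than `max_new_tokens` are left at the library defaults in this sketch.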

#### Citation
```
@inproceedings{chen-etal-2024-monolingual,
  title="Monolingual or multilingual instruction tuning: Which makes a better {Alpaca}",
  author="Pinzhen Chen and Shaoxiong Ji and Nikolay Bogoychev and Andrey Kutuzov and Barry Haddow and Kenneth Heafield",
  year="2024",
  booktitle="Findings of the Association for Computational Linguistics: EACL 2024",
}
```