# KoChatBART

[**BART**](https://arxiv.org/pdf/1910.13461.pdf) (**B**idirectional and **A**uto-**R**egressive **T**ransformers) is trained as an `autoencoder`: noise is added to part of the input text, and the model learns to restore the original. Korean Chat BART (hereafter **KoChatBART**) is a Korean `encoder-decoder` language model trained on roughly **10GB** or more of Korean dialogue text, using the `Text Infilling` noise function from the paper. We release the resulting `KoChatBART-base`, a model that is robust for dialogue generation.

<img src=https://user-images.githubusercontent.com/55969260/205434343-b72641e9-d0f9-4b88-a334-9f904e0a35c5.png>

## Quick tour

```python
from transformers import AutoTokenizer, BartForConditionalGeneration

# Load the released tokenizer and pre-trained checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("BM-K/KoChatBART")
model = BartForConditionalGeneration.from_pretrained("BM-K/KoChatBART")

# Tokenize a short Korean greeting and run a single forward pass.
inputs = tokenizer("안녕 세상아!", return_tensors="pt")
outputs = model(**inputs)
```

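The forward pass above returns logits rather than text. Decoding a reply could look roughly like the sketch below. It is not taken from this repository: `generate_reply` and `load_kochatbart` are hypothetical helper names, and the beam size and length limit are arbitrary illustrative values. The released checkpoint is pre-trained only, so useful replies should probably be expected only after fine-tuning on a dialogue dataset.

```python
def generate_reply(model, tokenizer, text, max_new_tokens=32, num_beams=4):
    """Generate a reply for `text` (hypothetical helper, not part of the repo)."""
    inputs = tokenizer(text, return_tensors="pt")
    # Pass only the tensors BART needs; some tokenizers also return
    # token_type_ids, which BART does not use.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,  # illustrative value
        num_beams=num_beams,            # illustrative value
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_kochatbart(name="BM-K/KoChatBART"):
    """Load the tokenizer and model from the Hub (needs network access)."""
    from transformers import AutoTokenizer, BartForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = BartForConditionalGeneration.from_pretrained(name)
    return model.eval(), tokenizer
```

A typical call would be `model, tokenizer = load_kochatbart()` followed by `generate_reply(model, tokenizer, "안녕 세상아!")`.
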
## Pre-training data preprocessing

Datasets used:

- [Topic-specific Text Daily Conversation Data (주제별 텍스트 일상 대화 데이터)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=543)
- [Small Business Customer Order Question-Answer Text (소상공인 고객 주문 질의-응답 텍스트)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=102)
- [Korean SNS (한국어 SNS)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=114)
- [Civil Complaint Work Automation AI Language Data (민원 업무 자동화 인공지능 언어 데이터)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=619)

To train KoChatBART, we preprocessed these Korean dialogue datasets and merged them into one large Korean dialogue corpus.

1. To reduce redundancy, expressions repeated more than twice in a row, such as 'ㅋㅋㅋㅋㅋㅋ', were shortened to two repetitions, as in 'ㅋㅋ'.
2. Because very short samples can hinder training, we kept only samples whose total length exceeds 3 tokens under the KoBART tokenizer.
3. Pseudonymized data was removed.

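The three preprocessing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function names are hypothetical, whitespace splitting stands in for the KoBART tokenizer, and detection of pseudonymized samples is left to a caller-supplied predicate because the README does not say how such samples are marked.

```python
import re
from typing import Callable, Iterable, List

# A non-digit, non-space character repeated three or more times in a row.
# Digits are excluded (an assumption) so that numbers such as "1000" survive.
_REPEAT = re.compile(r"([^\d\s])\1{2,}")


def collapse_repeats(text: str) -> str:
    """Shorten runs like 'ㅋㅋㅋㅋㅋㅋ' to exactly two repetitions ('ㅋㅋ')."""
    return _REPEAT.sub(r"\1\1", text)


def preprocess(
    utterances: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in tokenizer
    min_tokens: int = 3,
    is_pseudonymized: Callable[[str], bool] = lambda u: False,  # hypothetical
) -> List[str]:
    """Apply the three filtering steps described above."""
    kept = []
    for utterance in utterances:
        if is_pseudonymized(utterance):      # step 3: drop pseudonymized data
            continue
        cleaned = collapse_repeats(utterance.strip())  # step 1
        if len(tokenize(cleaned)) > min_tokens:        # step 2: length > 3
            kept.append(cleaned)
    return kept
```

To follow the README more closely, pass the KoBART tokenizer's `tokenize` method as the `tokenize` argument instead of the whitespace default.
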
## Model

| Model | # of params | vocab size | Type | # of layers | # of heads | ffn_dim | hidden_dims |
| ------------- | :---------: | :-----: | :----------: | ---------: | ------: | ----------: | ----------: |
| `KoChatBART` | 139M | 50265 | Encoder | 6 | 16 | 3072 | 768 |
| | | | Decoder | 6 | 16 | 3072 | 768 |

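As a sanity check on the table, the parameter count can be estimated from the listed dimensions. The sketch below assumes the standard BART layer layout (token embeddings shared between encoder and decoder, learned position embeddings for 1024 positions plus an offset of 2, one layer norm after the embeddings of each stack, biases on all linear layers). None of these details are stated in this README, and `bart_param_count` is a hypothetical helper.

```python
def bart_param_count(
    vocab_size=50265,
    d_model=768,
    ffn_dim=3072,
    encoder_layers=6,
    decoder_layers=6,
    max_positions=1024,   # assumed, BART default
    position_offset=2,    # assumed, BART's learned-position offset
):
    """Estimate trainable parameters for a BART-style encoder-decoder."""

    def linear(n_in, n_out):
        return n_in * n_out + n_out  # weight matrix plus bias

    layer_norm = 2 * d_model                      # scale and shift
    attention = 4 * linear(d_model, d_model)      # q, k, v, output projections
    feed_forward = linear(d_model, ffn_dim) + linear(ffn_dim, d_model)

    # Encoder layer: self-attention and feed-forward, each with a layer norm.
    encoder_layer = attention + layer_norm + feed_forward + layer_norm
    # Decoder layer: self-attention, cross-attention, feed-forward.
    decoder_layer = 2 * (attention + layer_norm) + feed_forward + layer_norm

    token_embedding = vocab_size * d_model        # shared by both stacks
    per_stack = (max_positions + position_offset) * d_model + layer_norm

    return (
        token_embedding
        + 2 * per_stack
        + encoder_layers * encoder_layer
        + decoder_layers * decoder_layer
    )


total = bart_param_count()
print(f"{total:,} parameters (about {total / 1e6:.0f}M)")
```

Under these assumptions the estimate comes out close to the 139M reported in the table. The number of attention heads does not appear in the formula, because heads only partition the projection matrices and do not change their size.
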
## Dialogue generation performance

We fine-tuned each model using the [Dialogue Generator](https://github.com/2unju/KoBART_Dialogue_Generator) code. To measure dialogue generation performance, the tokenized responses generated at inference time were first restored to plain text. A BPE tokenizer was then applied to both the actual and the generated responses to measure overlap (BLEU) and distinctness (Dist-n) between them.

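For readers unfamiliar with the two metric families, the sketch below shows one common way to compute them on pre-tokenized responses. It is an illustration only: the README does not give the evaluation script, so the exact BLEU variant (smoothing, corpus level versus sentence level) and the scaling of the reported scores may differ, and the function names are hypothetical. Inputs are lists of token lists, for example the output of a BPE tokenizer.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses, n):
    """Dist-n: unique n-grams divided by total n-grams over all responses."""
    all_ngrams = [g for tokens in responses for g in ngrams(tokens, n)]
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0


def corpus_bleu(references, hypotheses, max_n=4):
    """Unsmoothed corpus-level BLEU-max_n with one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            hyp_counts = Counter(ngrams(hyp, n))
            ref_counts = Counter(ngrams(ref, n))
            # Clipped matches: each n-gram counts at most as often as in ref.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)
```

Calling `corpus_bleu(refs, hyps, max_n=3)` and `corpus_bleu(refs, hyps, max_n=4)` gives BLEU-3 and BLEU-4 style scores, and `distinct_n(hyps, 1)` and `distinct_n(hyps, 2)` give Dist-1 and Dist-2 style scores.
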
> **Warning** <br>
> Because the model was pre-trained on dialogue data that is generally short, it performs poorly on tasks that require processing long text, such as summarization.

### Experimental results

- [Emotional dialogue data](https://github.com/songys/Chatbot_data)

|Training|Validation|Test|
|:----:|:----:|:----:|
|9,458|1,182|1,183|

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|------------------------|:----:|:----:|:----:|:----:|:----:|
| KoBART | 124M | 8.73 | 7.12 | 16.85 | 34.89 |
| KoChatBART | 139M | **12.97** | **11.23** | **19.64** | **44.53** |
| KoT5-ETRI | 324M | 12.10 | 10.14 | 16.97 | 40.09 |

- [Small business dialogue data](https://github.com/2unju/AIHub_Chitchat_dataset_parser)

|Training|Validation|Test|
|:----:|:----:|:----:|
|29,093|1,616|1,616|

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|------------------------|:----:|:----:|:----:|:----:|:----:|
| KoBART | 124M | 10.04 | 7.24 | 13.76 | 42.09 |
| KoChatBART | 139M | **10.11** | **7.26** | **15.12** | **46.08** |
| KoT5-ETRI | 324M | 9.45 | 6.66 | 14.50 | 45.46 |

## Contributors

<a href="https://github.com/BM-K/KoChatBART/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=BM-K/KoChatBART" />
</a>

## Reference

- [KoBART](https://github.com/SKT-AI/KoBART)