license: mit
language:
  - ja
  - ko
pipeline_tag: translation

Japanese to Korean translator

This model was trained on datasets from 'The Open AI Dataset Project (AI-Hub, South Korea)'. The datasets were built with the support of the National Information Society Agency (NIA), funded by the Ministry of Science and ICT of Korea.
All of the data can be downloaded from AI-Hub (aihub.or.kr).
(A corporation, organization, or individual located outside of Korea needs a separate agreement with the performing organization and the National Information Society Agency (NIA) to use the AI data. Exporting the AI data outside of Korea likewise requires a separate agreement with the performing organization and the NIA. Link)

Dataset list

The dataset used to train the model was built by merging the following sub-datasets:

    1. ์ผ์ƒ์ƒํ™œ ๋ฐ ๊ตฌ์–ด์ฒด ํ•œ-์ค‘, ํ•œ-์ผ ๋ฒˆ์—ญ ๋ณ‘๋ ฌ ๋ง๋ญ‰์น˜ ๋ฐ์ดํ„ฐ [Link]
    1. ํ•œ๊ตญ์–ด-๋‹ค๊ตญ์–ด(์˜์–ด ์ œ์™ธ) ๋ฒˆ์—ญ ๋ง๋ญ‰์น˜(๊ธฐ์ˆ ๊ณผํ•™) [Link]
    1. ํ•œ๊ตญ์–ด-๋‹ค๊ตญ์–ด ๋ฒˆ์—ญ ๋ง๋ญ‰์น˜(๊ธฐ์ดˆ๊ณผํ•™) [Link]
    1. ํ•œ๊ตญ์–ด-๋‹ค๊ตญ์–ด ๋ฒˆ์—ญ ๋ง๋ญ‰์น˜ (์ธ๋ฌธํ•™) [Link]
  • ํ•œ๊ตญ์–ด-์ผ๋ณธ์–ด ๋ฒˆ์—ญ ๋ง๋ญ‰์น˜ [Link]

To reproduce the merged dataset, you can use the code at the link below:
https://github.com/sappho192/aihub-translation-dataset
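
The repository linked above holds the actual merge code. As a rough illustration only of what merging several parallel sub-datasets can look like, here is a minimal Python sketch. It assumes each sub-dataset has already been loaded as (Japanese, Korean) sentence pairs; the function name `merge_parallel_corpora`, the sample sentences, and the choice to drop empty and duplicate pairs are assumptions made for this example, not the repository's confirmed behavior.

```python
from typing import Iterable

# Hypothetical representation: one (Japanese sentence, Korean sentence) pair.
Pair = tuple[str, str]


def merge_parallel_corpora(corpora: Iterable[Iterable[Pair]]) -> list[Pair]:
    """Concatenate sub-datasets in order, skipping empty and duplicate pairs."""
    seen: set[Pair] = set()
    merged: list[Pair] = []
    for corpus in corpora:
        for ja, ko in corpus:
            pair = (ja.strip(), ko.strip())
            # Skip pairs where either side is empty after trimming whitespace.
            if not pair[0] or not pair[1]:
                continue
            # Skip pairs already collected from this or an earlier sub-dataset.
            if pair in seen:
                continue
            seen.add(pair)
            merged.append(pair)
    return merged


if __name__ == "__main__":
    # Toy stand-ins for two sub-datasets; real AI-Hub data is not included here.
    daily = [("こんにちは", "안녕하세요"), ("ありがとう", "감사합니다")]
    science = [("こんにちは", "안녕하세요"), ("水", "물")]
    for ja, ko in merge_parallel_corpora([daily, science]):
        print(ja, "->", ko)
```

Keeping the first occurrence of each pair preserves the order of the sub-datasets, which makes the merged output deterministic for a fixed input order.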