---
language:
- ko
tags:
- ocr
widget:
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg
example_title: word1
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/khs.jpg
example_title: word2
- src: https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/m.jpg
example_title: word3
pipeline_tag: image-to-text
---
# Korean TrOCR model
- Because the TrOCR decoder's tokenizer cannot emit characters that are missing from its vocabulary, this model uses a decoder whose tokenizer covers Korean initial consonants (choseong), so choseong no longer come out as UNK (a quick vocabulary check follows this list).
- It was built using know-how gained from the [2023 Kyowon Group AI OCR Challenge](https://dacon.io/competitions/official/236042/overview/description).
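One way to verify that claim is to tokenize a few bare choseong characters and check for the UNK id. This is a minimal sketch, assuming the repo id `ddobokki/ko-trocr` from the usage example below; the expected output holds only if the vocabulary really does cover compatibility jamo, as the card states.

```python
from transformers import AutoTokenizer

# Sketch: check whether bare initial consonants (choseong) survive tokenization.
tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

ids = tokenizer("ㄱㄴㄷ", add_special_tokens=False)["input_ids"]
# If the card's claim holds, none of the ids is the UNK id.
print(tokenizer.unk_token_id in ids)  # expected: False
```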
## training datasets
AI Hub
- [Various Forms of Korean Character OCR (다양한 형태의 한글 문자 OCR)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=91)
- [Public Administrative Document OCR (공공행정문서 OCR)](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=88)
## model structure
- encoder: [trocr-base-stage1's encoder](https://huggingface.co/microsoft/trocr-base-stage1)
- decoder: [KR-BERT-char16424](https://huggingface.co/snunlp/KR-BERT-char16424) (see the wiring sketch below)
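A hedged sketch of how an encoder-decoder pair like this can be assembled with 🤗 Transformers. This is a reconstruction under stated assumptions, not the exact training code; the special-token wiring (CLS as decoder start, SEP as EOS) follows common `VisionEncoderDecoderModel` practice rather than anything the card specifies.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    VisionEncoderDecoderModel,
)

# Take the vision encoder from the stage-1 TrOCR checkpoint.
trocr = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-stage1")

# Load KR-BERT as a causal decoder with cross-attention to the image features.
decoder = AutoModelForCausalLM.from_pretrained(
    "snunlp/KR-BERT-char16424", is_decoder=True, add_cross_attention=True
)

model = VisionEncoderDecoderModel(encoder=trocr.encoder, decoder=decoder)

# Wire up the special tokens the decoder expects (assumed, common practice).
tokenizer = AutoTokenizer.from_pretrained("snunlp/KR-BERT-char16424")
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.vocab_size = model.config.decoder.vocab_size
```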
## how to use
```python
from io import BytesIO

import requests
import unicodedata
from PIL import Image
from transformers import AutoTokenizer, TrOCRProcessor, VisionEncoderDecoderModel

# Load the image processor, OCR model, and character-level tokenizer.
processor = TrOCRProcessor.from_pretrained("ddobokki/ko-trocr")
model = VisionEncoderDecoderModel.from_pretrained("ddobokki/ko-trocr")
tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

# Fetch an example word image and make sure it is RGB.
url = "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content)).convert("RGB")

# Preprocess, generate token ids, and decode them to text.
pixel_values = processor(img, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_length=64)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Normalize to NFC so composed Hangul syllables compare consistently.
generated_text = unicodedata.normalize("NFC", generated_text)
print(generated_text)
```
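The same pipeline also runs batched and on GPU. A minimal sketch continuing from the snippet above (the two-image `images` list and the device choice are illustrative, not part of the original card):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Batch several PIL images in one forward pass (illustrative list).
images = [img, img]
pixel_values = processor(images, return_tensors="pt").pixel_values.to(device)
generated_ids = model.generate(pixel_values, max_length=64)
texts = [
    unicodedata.normalize("NFC", t)
    for t in tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
]
print(texts)
```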