metadata
license: apache-2.0
datasets:
- nastyboget/stackmix_hkr_large
- nastyboget/stackmix_cyrillic_large
- nastyboget/synthetic_cyrillic_large
language:
- ru
- en
pipeline_tag: image-to-text
tags:
- ocr
Model Card for TrOCR-Ru
This model is microsoft/trocr-base-handwritten fine-tuned on large synthetic datasets published by nastyboget.
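A minimal inference sketch follows, assuming the checkpoint loads through the same `transformers` classes as its base model (`TrOCRProcessor` and `VisionEncoderDecoderModel`) and that the repository ships its own processor config. The function name `recognize_line` and both of its parameters are illustrative and not part of this card. Pass the Hub repository ID of this model as `model_id`; it is not stated in this card, so none is hard-coded here.

```python
def recognize_line(image_path, model_id):
    """Run OCR on a single cropped text-line image and return the decoded string.

    `model_id` is the Hub repository ID of the fine-tuned checkpoint
    (an assumption: it must be supplied by the caller).
    """
    # Imports are local so that defining the function needs no heavy dependencies.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects a three-channel image of one text line.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Default generation settings; beam search or length limits can be passed here.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Example call (hypothetical file name, placeholder repository ID):
# text = recognize_line("line.png", "<repo-id-of-this-model>")
```

TrOCR models are trained on single text lines, so a full page would likely need to be segmented into lines before calling this function.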
Metrics on HKR/Cyrillic datasets
Metric | HKR_val | HKR_test1 | HKR_test2 | CYR_val | CYR_test |
---|---|---|---|---|---|
Accuracy | 69.9947 | 67.4184 | 69.9187 | 72.3613 | 63.9249 |
CER | 6.7964 | 8.9113 | 6.7278 | 6.6403 | 9.2576 |
WER | 21.6688 | 27.3849 | 21.6200 | 27.6715 | 33.2406 |
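
The card does not state how these metrics were computed. The sketch below shows one common convention, under these assumptions: accuracy is the share of lines recognized exactly, CER and WER are Levenshtein edit distances at character and word level divided by the total reference length, pooled over the whole split, and all three are reported in percent. The helper names `edit_distance` and `evaluate` are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(references, hypotheses):
    """Return accuracy, CER and WER in percent for paired lists of strings."""
    if len(references) != len(hypotheses) or not references:
        raise ValueError("references and hypotheses must be non-empty and aligned")

    exact = char_errors = char_total = word_errors = word_total = 0
    for ref, hyp in zip(references, hypotheses):
        exact += int(ref == hyp)
        char_errors += edit_distance(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errors += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)

    return {
        "accuracy": 100.0 * exact / len(references),
        # max(..., 1) avoids division by zero if every reference is empty.
        "cer": 100.0 * char_errors / max(char_total, 1),
        "wer": 100.0 * word_errors / max(word_total, 1),
    }


scores = evaluate(["привет мир", "hello"], ["привет мир", "hallo"])
print(scores)
```

In this toy example one of two lines matches exactly, one of 15 reference characters is wrong, and one of three reference words is wrong. Averaging per-line error rates instead of pooling them would give different numbers, so the convention matters when comparing against the table.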
Last updated: 29/02/2024