---
license: apache-2.0
datasets:
- nastyboget/stackmix_hkr_large
- nastyboget/stackmix_cyrillic_large
- nastyboget/synthetic_cyrillic_large
language:
- ru
- en
pipeline_tag: image-to-text
tags:
- ocr
---
# Model Card for TrOCR-Ru
A fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten), trained on large synthetic datasets from [nastyboget](https://huggingface.co/nastyboget).
## Metrics on HKR/Cyrillic datasets
| Metric | HKR_val | HKR_test1 | HKR_test2 | CYR_val | CYR_test |
|:--------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| Accuracy | 69.9947 | 67.4184 | 69.9187 | 72.3613 | 63.9249 |
| CER | 6.7964 | 8.9113 | 6.7278 | 6.6403 | 9.2576 |
| WER | 21.6688 | 27.3849 | 21.6200 | 27.6715 | 33.2406 |

Last updated on 29/02/2024
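
For readers who want to reproduce numbers like those in the table, here is a minimal sketch of how accuracy, CER, and WER are conventionally computed. The card does not say how its metrics were calculated, so everything here is an assumption: the helper names `edit_distance` and `evaluate` are hypothetical, error rates are aggregated over the whole corpus rather than averaged per sample, accuracy is taken to mean exact string match, and all values are reported as percentages.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(references, predictions):
    """Return accuracy, CER and WER in percent (assumed corpus-level aggregation)."""
    char_errors = char_total = 0
    word_errors = word_total = 0
    exact = 0
    for ref, hyp in zip(references, predictions):
        # Character-level errors for CER.
        char_errors += edit_distance(ref, hyp)
        char_total += len(ref)
        # Word-level errors for WER, splitting on whitespace.
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errors += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        # Exact-match accuracy on the full line.
        exact += int(ref == hyp)
    return {
        "accuracy": 100.0 * exact / len(references),
        "cer": 100.0 * char_errors / char_total,
        "wer": 100.0 * word_errors / word_total,
    }


if __name__ == "__main__":
    refs = ["привет мир", "кот"]
    preds = ["привет мир", "кит"]
    print(evaluate(refs, preds))
```

Per-sample averaging or different text normalization (case folding, punctuation stripping) would give somewhat different values, so results from this sketch may not match the table exactly.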