---
license: apache-2.0
datasets:
- nastyboget/stackmix_hkr_large
- nastyboget/stackmix_cyrillic_large
- nastyboget/synthetic_cyrillic_large
language:
- ru
- en
pipeline_tag: image-to-text
tags:
- ocr
---

# Model Card for TrOCR-Ru

A fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten), trained on large synthetic datasets from [nastyboget](https://huggingface.co/nastyboget).

## Metrics on HKR/Cyrillic datasets

| Metric   | HKR_val   | HKR_test1 | HKR_test2 | CYR_val   | CYR_test  |
|:--------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| Accuracy | 69.9947   | 67.4184   | 69.9187   | 72.3613   | 63.9249   |
| CER      | 6.7964    | 8.9113    | 6.7278    | 6.6403    | 9.2576    |
| WER      | 21.6688   | 27.3849   | 21.6200   | 27.6715   | 33.2406   |

Last updated on 29/02/2024.
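
The card does not say how the three metrics in the table were computed. The sketch below shows one common way to derive them for OCR output. It assumes that Accuracy is the share of lines recognized exactly, that CER and WER are the Levenshtein edit distance over characters and whitespace-separated words divided by the reference length, and that all three are reported as percentages. The function names `edit_distance` and `ocr_metrics` are hypothetical and are not part of this model's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference item
                curr[j - 1] + 1,     # insert a hypothesis item
                prev[j - 1] + cost,  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def ocr_metrics(references, predictions):
    """Return accuracy, CER and WER in percent (assumed definitions)."""
    pairs = list(zip(references, predictions))
    char_errors = sum(edit_distance(r, p) for r, p in pairs)
    char_total = sum(len(r) for r, _ in pairs)
    word_errors = sum(edit_distance(r.split(), p.split()) for r, p in pairs)
    word_total = sum(len(r.split()) for r, _ in pairs)
    exact = sum(1 for r, p in pairs if r == p)
    return {
        "accuracy": 100.0 * exact / len(pairs),
        "cer": 100.0 * char_errors / char_total,
        "wer": 100.0 * word_errors / word_total,
    }


if __name__ == "__main__":
    # Toy data for illustration only, not taken from HKR or Cyrillic.
    refs = ["привет мир", "кот"]
    preds = ["привет мир", "кит"]
    print(ocr_metrics(refs, preds))
```

Errors are summed over the whole set before dividing, so long lines weigh more than short ones. Averaging per-line rates instead would give different numbers, which is worth keeping in mind when comparing against the table.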