---
license: apache-2.0
datasets:
- nastyboget/stackmix_hkr_large
- nastyboget/stackmix_cyrillic_large
- nastyboget/synthetic_cyrillic_large
language:
- ru
- en
pipeline_tag: image-to-text
tags:
- ocr
---

# Model Card for TrOCR-Ru

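This model is [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten) fine-tuned on large synthetic datasets from [nastyboget](https://huggingface.co/nastyboget): `stackmix_hkr_large`, `stackmix_cyrillic_large`, and `synthetic_cyrillic_large`. It recognizes handwritten text in Russian and English.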
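
A minimal inference sketch using the standard TrOCR classes from `transformers`. The function name `recognize_line` and the `checkpoint` argument are illustrative and not part of this repository. The default value is the base model named above; replacing it with this model's repository id should load the fine-tuned weights, assuming the repository ships a compatible processor config.

```python
def recognize_line(image_path: str,
                   checkpoint: str = "microsoft/trocr-base-handwritten") -> str:
    """Run OCR on a single cropped text-line image and return the decoded string.

    `checkpoint` is a placeholder: pass this model's repository id to use the
    fine-tuned weights instead of the base model.
    """
    # Heavy dependencies are imported on first use (requires pillow,
    # transformers and torch to be installed).
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # TrOCR expects a 3-channel image of one text line.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressively generate token ids, then decode them to text.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

TrOCR works on single text lines, so full pages would need to be segmented into line crops before calling this function.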

## Metrics on HKR/Cyrillic datasets 

|  Metric  | HKR_val   | HKR_test1 | HKR_test2 | CYR_val   | CYR_test  |
|:--------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| Accuracy | 69.9947   | 67.4184   | 69.9187   | 72.3613   | 63.9249   |
| CER      | 6.7964    | 8.9113    | 6.7278    | 6.6403    | 9.2576    |
| WER      | 21.6688   | 27.3849   | 21.6200   | 27.6715   | 33.2406   |
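
The card does not state how these metrics were computed. The sketch below shows one common convention, given here as an assumption: accuracy is the percentage of lines predicted exactly, while CER and WER are the total Levenshtein distance over characters or whitespace-separated words, divided by the total reference length and expressed as a percentage. The helper names `edit_distance` and `score` are illustrative.

```python
from typing import Dict, List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(refs: List[str], hyps: List[str]) -> Dict[str, float]:
    """Corpus-level accuracy, CER and WER in percent (assumed convention)."""
    pairs = list(zip(refs, hyps))
    char_errors = sum(edit_distance(r, h) for r, h in pairs)
    char_total = sum(len(r) for r in refs)
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    word_total = sum(len(r.split()) for r in refs)
    exact = sum(r == h for r, h in pairs)
    return {
        "accuracy": 100.0 * exact / len(refs),
        "cer": 100.0 * char_errors / char_total,
        "wer": 100.0 * word_errors / word_total,
    }


if __name__ == "__main__":
    print(score(["привет мир", "hello"], ["привет мир", "hallo"]))
```

Other conventions exist, for example averaging per-line error rates instead of pooling errors over the corpus, and they can yield different numbers on the same predictions.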

Last updated: 29/02/2024