---
language:
- de
library_name: transformers
datasets:
- fhswf/german_handwriting
license: afl-3.0
pipeline_tag: image-to-text
---
# Model Card for TrOCR_german_handwritten
<!-- Provide a quick summary of what the model is/does. -->
## Model Details
<!-- Provide a longer summary of what this model is. -->
A TrOCR model fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset. The TrOCR architecture was introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al., and the original models were first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).
- **Developed by:** [More Information Needed]
- **Model type:** Transformer OCR
- **Language(s) (NLP):** German
- **License:** afl-3.0
- **Finetuned from model [optional]:** [TrOCR_large_handwritten](https://huggingface.co/microsoft/trocr-large-handwritten)
## Uses
Here is how to use this model in PyTorch:
```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests
# Load a single text-line image (this sample comes from the IAM database)
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Load the processor (image preprocessing and tokenizer) and the fine-tuned model
processor = TrOCRProcessor.from_pretrained('fhswf/TrOCR_german_handwritten')
model = VisionEncoderDecoderModel.from_pretrained('fhswf/TrOCR_german_handwritten')

# Preprocess the image, generate token IDs, and decode them to text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```
## Bias, Risks, and Limitations
You can use the raw model for optical character recognition (OCR) on single text-line images of German handwriting. Images that contain several lines or a full page should be split into individual lines first.
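Because the model expects one text line per image, a page usually has to be cut into lines before recognition. The following is a minimal sketch of one common approach, a horizontal projection profile, and is not part of this model or its training pipeline. The function names `find_line_spans` and `crop_text_lines` and the thresholds are hypothetical, and the method assumes dark ink on a light background with visible gaps between lines, which handwriting with overlapping ascenders and descenders often violates.

```python
def find_line_spans(row_ink, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges where consecutive rows contain ink.

    row_ink: number of dark pixels in each pixel row, top to bottom.
    bottom is exclusive, matching the crop box convention used by PIL.
    """
    spans = []
    start = None
    for y, ink in enumerate(row_ink):
        if ink >= min_ink and start is None:
            start = y  # a text line begins
        elif ink < min_ink and start is not None:
            if y - start >= min_height:  # skip specks thinner than min_height
                spans.append((start, y))
            start = None
    # Handle a line that touches the bottom edge of the page
    if start is not None and len(row_ink) - start >= min_height:
        spans.append((start, len(row_ink)))
    return spans


def crop_text_lines(page, dark_threshold=128):
    """Split a PIL image of a page into one cropped image per text line.

    dark_threshold is an assumed grayscale cutoff; tune it for your scans.
    """
    gray = page.convert("L")
    width, height = gray.size
    pixels = gray.tobytes()  # one byte per pixel, row by row
    row_ink = [
        sum(1 for value in pixels[y * width:(y + 1) * width] if value < dark_threshold)
        for y in range(height)
    ]
    return [page.crop((0, top, width, bottom))
            for top, bottom in find_line_spans(row_ink)]
```

Each image returned by `crop_text_lines` can then be passed to the processor and model in the same way as the single-line example in the Uses section. For difficult layouts, a dedicated line-segmentation tool is likely to work better than this sketch.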
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
This model was fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset.
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
Levenshtein distance: 1.85 <br>
WER (Word Error Rate): 17.5% <br>
CER (Character Error Rate): 4.1%
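
The card does not state how these figures were computed or aggregated. As an illustration only, the sketch below uses the standard definitions: the Levenshtein distance counts insertions, deletions, and substitutions, CER divides the character-level distance by the reference length, and WER does the same at word level. The helper names are hypothetical, and per-sample computation is an assumption.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)


reference = "das ist ein Test"
hypothesis = "das ist ein Text"
print(levenshtein(reference, hypothesis))  # 1
print(cer(reference, hypothesis))          # 0.0625
print(wer(reference, hypothesis))          # 0.25
```

A single wrong character costs a whole word in WER, which is why WER is typically much higher than CER for the same output.
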
**BibTeX:**
```bibtex
@misc{li2021trocr,
title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
year={2021},
eprint={2109.10282},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```