---
library_name: transformers
datasets:
- fhswf/german_handwriting
language:
- de
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

## Model Details

<!-- Provide a longer summary of what this model is. -->

TrOCR model fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset. The TrOCR architecture was introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al. and first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).

- **Developed by:** [More Information Needed]
- **Model type:** Transformer OCR
- **Language(s) (NLP):** German
- **License:** [More Information Needed]
- **Finetuned from model:** [microsoft/trocr-large-handwritten](https://huggingface.co/microsoft/trocr-large-handwritten)

## Uses

Here is how to use this model in PyTorch:

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests

# Load a sample single-line handwriting image from the IAM database
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Load the processor and the fine-tuned model
processor = TrOCRProcessor.from_pretrained('TGrote11/testModel')
model = VisionEncoderDecoderModel.from_pretrained('TGrote11/testModel')

# Preprocess the image into pixel values
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate token IDs and decode them into text
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```

## Bias, Risks, and Limitations

You can use the raw model for optical character recognition (OCR) on single text-line images of German handwriting. Full pages or multi-line images should be segmented into individual lines before recognition.
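
Because the model expects single text-line images, a multi-line scan would have to be split into lines first. The following is only a minimal sketch of one common approach (a horizontal projection profile), not part of this model or its training pipeline. It assumes the page has already been binarized into rows of 0 (background) and 1 (ink); the function name `find_text_lines` and the `min_ink` parameter are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def find_text_lines(binary_rows: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, that contain ink.

    Hypothetical helper: a row counts as text when it holds at least
    `min_ink` ink pixels; consecutive text rows form one line band.
    """
    bands = []
    start = None
    for y, row in enumerate(binary_rows):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new text band begins
        elif not has_ink and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(binary_rows)))
    return bands


# Toy 6x4 "page" with two text lines separated by an empty row
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(find_text_lines(page))  # [(1, 3), (4, 5)]
```

Each `(top, bottom)` band could then be cropped from the original image, for example with PIL's `image.crop((0, top, image.width, bottom))`, and passed to the processor one line at a time. Real handwriting with touching or slanted lines usually needs a more robust segmentation method.
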
## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

- Levenshtein: 1.85
- WER (Word Error Rate): 17.5%
- CER (Character Error Rate): 4.1%
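
For readers unfamiliar with these metrics, the sketch below shows how Levenshtein distance, CER, and WER are commonly defined for a single reference/prediction pair. It is an illustration only: the card does not state how the reported figures were computed or aggregated, so the function names and the per-sample normalization here are assumptions, and the example strings are made up.

```python
from typing import Sequence


def levenshtein(ref: Sequence, hyp: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))       # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes a non-empty reference."""
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate on whitespace-split tokens; assumes a non-empty reference."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)


reference = "Guten Morgen Welt"    # made-up ground truth
hypothesis = "Guten Morgan Welt"   # made-up model output with one wrong character
print(levenshtein(reference, hypothesis))  # 1
print(f"CER: {cer(reference, hypothesis):.3f}")
print(f"WER: {wer(reference, hypothesis):.3f}")
```

A single wrong character costs one full word in WER but only one character in CER, which is why WER is typically much higher than CER for the same predictions.
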

**BibTeX:**

```bibtex
@misc{li2021trocr,
      title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
      author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
      year={2021},
      eprint={2109.10282},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```