	---
	library_name: transformers
	datasets:
	- fhswf/german_handwriting
	language:
	- de
	---

	# Model Card for TrOCR_german_handwritten

	<!-- Provide a quick summary of what the model is/does. -->



	## Model Details


	<!-- Provide a longer summary of what this model is. -->

	TrOCR model fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset. The underlying TrOCR architecture was introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al. and first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).

	- Developed by: [More Information Needed]
	- Model type: Transformer OCR
	- Language(s) (NLP): German
	- License: [More Information Needed]
	- Finetuned from model: [TrOCR_large_handwritten](https://huggingface.co/microsoft/trocr-large-handwritten)


	## Uses

	Here is how to use this model in PyTorch:

	```python
	from transformers import TrOCRProcessor, VisionEncoderDecoderModel
	from PIL import Image
	import requests

	# Example line image from the IAM database (English handwriting);
	# replace it with a single text-line image of German handwriting.
	url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
	image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

	# Load the processor and the fine-tuned model from the Hub
	processor = TrOCRProcessor.from_pretrained('fhswf/TrOCR_german_handwritten')
	model = VisionEncoderDecoderModel.from_pretrained('fhswf/TrOCR_german_handwritten')

	# Preprocess the image, generate token IDs, and decode them to text
	pixel_values = processor(images=image, return_tensors="pt").pixel_values
	generated_ids = model.generate(pixel_values)
	generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(generated_text)
	```
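
When many line images need to be transcribed, the same three calls can be applied to small batches instead of one image at a time. The helper below is a minimal sketch, not part of the original model card: the name `recognize_lines` and the `batch_size` parameter are hypothetical, and it assumes `processor` and `model` are the objects loaded in the snippet above.

```python
from typing import Any, Iterable, List


def recognize_lines(images: Iterable[Any], processor: Any, model: Any,
                    batch_size: int = 8) -> List[str]:
    """Transcribe single text-line images in batches (hypothetical helper).

    `processor` and `model` are assumed to behave like the TrOCRProcessor
    and VisionEncoderDecoderModel loaded in the snippet above.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    images = list(images)
    texts: List[str] = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        # Same steps as for a single image, applied to a list of images
        pixel_values = processor(images=batch, return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values)
        texts.extend(processor.batch_decode(generated_ids, skip_special_tokens=True))
    return texts
```

A call such as `recognize_lines([image], processor, model)` would return a list with one transcription per input image, in input order. A suitable batch size depends on the available memory.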

	## Bias, Risks, and Limitations

	The model is intended for optical character recognition (OCR) on single text-line images of German handwriting. Multi-line or full-page images would need to be segmented into individual lines first.



	## Training Details

	### Training Data

	<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	This model was finetuned on [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting).

	### Training Procedure

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->


	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->
	- Levenshtein: 1.85
	- WER (Word Error Rate): 17.5%
	- CER (Character Error Rate): 4.1%
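
The card does not state how these metrics were computed. As a rough illustration of the usual definitions, the sketch below computes the Levenshtein (edit) distance, and from it CER and WER as the edit distance divided by the reference length in characters or words. The function names are hypothetical, and the per-sample normalization is an assumption; the reported figures may have been aggregated differently.

```python
from typing import Sequence


def levenshtein(ref: Sequence, hyp: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `ref` into `hyp`."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits / reference words (whitespace split)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `wer("der Hund läuft", "der Hund lauft")` counts one wrong word out of three, while `cer` on the same pair counts a single wrong character, which is why WER is typically much higher than CER for the same output.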




	## Citation

	BibTeX:

	```bibtex
	@misc{li2021trocr,
	title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
	author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
	year={2021},
	eprint={2109.10282},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```