PawanKrGunjan committed (verified)
Commit 4743652 · 1 Parent(s): f974d18

![__results___48_0.png](https://cdn-uploads.huggingface.co/production/uploads/646087d4846a6c8c83ff6f7b/9PckWn4CedbivO5l9I1oJ.png)

Files changed (1)
  1. README.md +92 -29
README.md CHANGED
@@ -1,49 +1,112 @@
  ---
- base_model: microsoft/trocr-base-handwritten
  tags:
- - generated_from_trainer
  model-index:
  - name: license_plate_recognizer
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pawankrgunjan/huggingface/runs/3caod3mu)
- # license_plate_recognizer

- This model is a fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0098
- - Cer: 0.0053

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 20

- ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Cer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
@@ -69,9 +132,9 @@ The following hyperparameters were used during training:
  | 0.0001 | 20.0 | 7940 | 0.0047 | 0.0018 |

- ### Framework versions

  - Transformers 4.42.3
  - Pytorch 2.1.2
  - Datasets 2.20.0
- - Tokenizers 0.19.1
 
  ---
+ language:
+ - en
+ license: mit
+ library_name: transformers
  tags:
+ - image-to-text
+ - license-plate-recognition
+ - ocr
+ - transformers
+ datasets:
+ - PawanKrGunjan/license_plates
+ metrics:
+ - cer
+ base_model: microsoft/trocr-base-handwritten
  model-index:
  - name: license_plate_recognizer
+   results:
+   - task:
+       type: image-to-text
+       name: License Plate Recognition
+     dataset:
+       type: PawanKrGunjan/license_plates
+       name: License Plates Dataset
+       config: default
+       split: validation
+     metrics:
+     - type: cer
+       value: 0.0036
+       name: Character Error Rate (CER)
+ pipeline_tag: image-to-text
  ---

+ # License Plate Recognizer
+
+ This model is a fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten), adapted to recognize and extract the text on license plate images. It was trained on the [PawanKrGunjan/license_plates](https://huggingface.co/datasets/PawanKrGunjan/license_plates) dataset and is intended for OCR tasks focused on license plates.
+
+ ## Model Description
+
+ ### TrOCR (base-sized model, fine-tuned on IAM)
+
+ This model is based on TrOCR, introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al. and first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).
+
+ TrOCR uses an encoder-decoder architecture:
+ - **Encoder:** A Transformer-based image encoder initialized from BEiT weights.
+ - **Decoder:** A Transformer-based text decoder initialized from RoBERTa weights.
+
+ The model processes an image as a sequence of 16x16-pixel patches and generates text autoregressively.
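
As a back-of-the-envelope illustration of that patch sequence (assuming the standard 384x384 input resolution that TrOCR processors resize images to, which is not stated explicitly above):

```python
# Patch-count arithmetic for the TrOCR encoder.
# Assumes the standard 384x384 TrOCR input resolution; the processor
# resizes each license-plate image to this size before encoding.
image_size = 384
patch_size = 16

patches_per_side = image_size // patch_size  # patches along each axis
num_patches = patches_per_side ** 2          # sequence length seen by the encoder

print(patches_per_side, num_patches)  # 24 576
```

So each image, whatever its original aspect ratio, becomes a fixed-length sequence of 576 patch embeddings that the decoder attends to while generating characters.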
+
+ The base checkpoint was fine-tuned on the IAM dataset of handwritten text; starting from these handwriting-aware weights makes it a strong foundation for recognizing the often irregular text on license plates.
+
+ ### Fine-Tuning Details
+
+ - **Base Model:** Fine-tuned from the [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten) model.
+ - **Dataset:** Fine-tuning was performed on the [PawanKrGunjan/license_plates](https://huggingface.co/datasets/PawanKrGunjan/license_plates) dataset.
+
+ ## Intended Uses & Limitations
+
+ ### Use Cases
+
+ - **License Plate Recognition:** Extract text from license plate images for use in automated systems.
+ - **Automated Surveillance:** Suitable for integration into surveillance systems for real-time monitoring.
+
+ ### Limitations
+
+ - **Environmental Constraints:** Performance may degrade in low-light conditions or with low-resolution images.
+ - **Regional Variability:** The model may struggle with license plate designs that differ significantly from those in the training dataset.

+ ## How to Use
+
+ Here’s an example of how to use the model in Python:
+
+ ```python
+ from transformers import TrOCRProcessor, VisionEncoderDecoderModel
+ from PIL import Image
+ import requests
+
+ # Load the processor and model
+ processor = TrOCRProcessor.from_pretrained("PawanKrGunjan/license_plate_recognizer")
+ model = VisionEncoderDecoderModel.from_pretrained("PawanKrGunjan/license_plate_recognizer")
+
+ # Load an image of a license plate (signed dataset-viewer URL; it may expire)
+ image_url = "https://datasets-server.huggingface.co/assets/PawanKrGunjan/license_plates/--/c1a289cb616808b2a834fae90d9625c2f78b82c9/--/default/train/34/image/image.jpg?Expires=1723689029&Signature=jlu~8q7l2MT2IhbS5UttYLkPaMX3416a9CByGBa9M5QKNqi9ezSTYLkDsliKKgO2c-TbiJ8LsEAOB8jmcXwQkN6eNBjrJpnyGqBZ7T99P-cXk5XwHiJa27bn6jINvBUBVID8ganhqBv-DubyyM4RcksxyjZNAE7yatBTBbaDk1-mno5pbL7fpFb~gHfMvMGalPWa-vO3teeoS0yHhp5yNzSjObmwzqn42bZpCFA3dleRPnzikyKPR3OzFK1BaPyr2bzJsLUlg3H7E8c3NGz~ryLjBREa2KpyM2X0JkhzvT0fEGsdaiyN36Tkqoi2aeH~KM8YzztD7W-jSH83dckdxw__&Key-Pair-Id=K3EI6M078Z3AC3"
+ image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
+
+ # Process the image into pixel values
+ pixel_values = processor(image, return_tensors="pt").pixel_values
+
+ # Generate the text prediction
+ generated_ids = model.generate(pixel_values)
+ generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+
+ print(generated_text)
+ ```
95
 
+ ## Training procedure
+
+ ### Training hyperparameters
+
  The following hyperparameters were used during training:
+ - **learning_rate**: 2e-05
+ - **train_batch_size**: 8
+ - **eval_batch_size**: 8
+ - **seed**: 42
+ - **optimizer**: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - **lr_scheduler_type**: linear
+ - **num_epochs**: 20
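
For reference, a `linear` scheduler decays the learning rate from its initial value to zero over the full run. A minimal sketch of that decay, assuming zero warmup steps (none are listed above) and using the 7,940 total steps reported at epoch 20 in the results table:

```python
# Linear learning-rate decay, matching the `linear` scheduler above.
# Assumes zero warmup steps, since none are listed in the hyperparameters.
total_steps = 7940  # from the results table: step 7940 at epoch 20.0
base_lr = 2e-05

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (clamped at zero)."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(lr_at(0))     # 2e-05 at the start
print(lr_at(3970))  # 1e-05 halfway through
print(lr_at(7940))  # 0.0 at the end of training
```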
 
+ ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Cer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
  ...
  | 0.0001 | 20.0 | 7940 | 0.0047 | 0.0018 |
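
The Cer column is the character error rate: the edit (Levenshtein) distance between predicted and reference strings, divided by the reference length. A pure-Python sketch (libraries such as `jiwer` or `evaluate` are the usual choice; the plate strings below are made-up examples):

```python
def cer(prediction: str, reference: str) -> float:
    """Character error rate: Levenshtein distance / reference length.

    Assumes a non-empty reference string.
    """
    m, n = len(prediction), len(reference)
    # prev[j] = edit distance between the current prediction prefix and reference[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if prediction[i - 1] == reference[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / n

print(cer("KA01AB1234", "KA01AB1234"))  # 0.0 -> exact match
print(cer("KA01AB1Z34", "KA01AB1234"))  # 0.1 -> one wrong character out of ten
```

A CER of 0.0018 therefore means roughly one character error per 550 reference characters.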

+ ### Framework versions

  - Transformers 4.42.3
  - Pytorch 2.1.2
  - Datasets 2.20.0
+ - Tokenizers 0.19.1