---
title: OCR IITRoorkie
emoji: 📚
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# OCR and Keyword Search Web Application
This web application performs Optical Character Recognition (OCR) on uploaded images containing Hindi and English text,
and provides keyword search over the extracted text.
## Setup
1. Install the required dependencies:
   `pip install -r requirements.txt`
   The requirements file lists the key libraries, such as transformers, gradio, pillow, tesseract, and pytesseract.
2. Install Tesseract OCR:
   On Windows, download and install it from https://github.com/UB-Mannheim/tesseract/wiki
3. Update the Tesseract path in the script. This was not needed when deploying to a Hugging Face Space, but it was required when running locally on my machine.
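For step 3, a minimal sketch of one way to set the Tesseract path is shown here. It relies on `pytesseract.pytesseract.tesseract_cmd`, which is the attribute pytesseract reads to locate the binary. The `TESSERACT_CMD` environment variable name, the helper function names, and the default Windows path are assumptions for illustration and are not taken from `app.py`.

```python
import os
import shutil

# Assumed install location for the UB-Mannheim Windows installer;
# change it to wherever Tesseract was actually installed.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(env_var="TESSERACT_CMD", fallback=DEFAULT_WINDOWS_PATH):
    """Return the Tesseract executable path to use, or None if not found.

    Order: explicit environment variable (hypothetical name), then
    whatever is on PATH, then the fallback path if that file exists.
    """
    explicit = os.environ.get(env_var)
    if explicit:
        return explicit
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if fallback and os.path.isfile(fallback):
        return fallback
    return None


def configure_pytesseract():
    """Point pytesseract at the resolved binary, if one was found."""
    import pytesseract  # imported lazily so the resolver works without it

    cmd = resolve_tesseract_cmd()
    if cmd:
        pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling `configure_pytesseract()` once near the top of the script would cover both cases: on a Hugging Face Space the binary is usually found on `PATH`, while on a local Windows machine the fallback path applies.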
## Running Locally
To run the application locally:
`python app.py`
## Deployment
To deploy on Hugging Face Spaces:
1. Create a new Space on Hugging Face.
2. While creating the Space, set the Space SDK to Gradio.
3. Upload the `app.py` file and create `requirements.txt` and `packages.txt` for Python libraries and system packages, respectively.
## Usage
1. Upload an image containing Hindi and English text.
2. Enter a keyword to search within the extracted text.
3. The application will display the extracted text and search results.
Note: OCR accuracy varies with image quality. Blurry or hazy text in the image may be read incorrectly.
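The usage steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `app.py`: it assumes pytesseract as the OCR backend (the requirements also list transformers, so the actual app may use a different model), and the function names `extract_text` and `search_keyword` are hypothetical. The `lang="hin+eng"` argument only works if the Hindi and English Tesseract language data are installed.

```python
import re


def search_keyword(text, keyword):
    """Return (line_number, line) pairs for lines containing the keyword.

    Matching is case-insensitive for English. Devanagari has no letter
    case, so Hindi keywords are effectively matched as plain substrings.
    """
    keyword = keyword.strip()
    if not keyword:
        return []
    # re.escape keeps characters like "." or "(" in the keyword literal.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def extract_text(image_path, lang="hin+eng"):
    """Run Tesseract on an image via pytesseract (assumed OCR backend)."""
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

A typical call would be `search_keyword(extract_text("scan.png"), "keyword")`, where `scan.png` is a placeholder file name. Returning line numbers along with the matching lines makes it straightforward to show results next to the full extracted text.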