	---
	title: OCR IITRoorkie
	emoji: 📚
	colorFrom: indigo
	colorTo: green
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


	# OCR and Keyword Search Web Application

	This web application performs Optical Character Recognition (OCR) on uploaded images containing Hindi and English text,
	and provides keyword search over the extracted text.

	## Setup

	1. Install the required dependencies:
	`pip install -r requirements.txt`
	This installs the key libraries: transformers, gradio, pillow, tesseract, and pytesseract.

	2. Install Tesseract OCR:
	On Windows, download and install it from https://github.com/UB-Mannheim/tesseract/wiki

	3. Update the Tesseract path in the script. This was not needed when deploying to the Hugging Face Space, but I had to set it when running the app locally on my machine.
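The path update in step 3 can be sketched as below. This is a minimal example, not the code from `app.py`: the helper names `find_tesseract` and `configure_pytesseract` are hypothetical, and the default Windows path is an assumption about where the installer places the binary, so adjust it to your machine. The only pytesseract-specific part is assigning `pytesseract.pytesseract.tesseract_cmd`.

```python
import os
import shutil

# Assumed install location on Windows; change this if Tesseract lives elsewhere.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(candidates=(DEFAULT_WINDOWS_PATH,)):
    """Return a path to the tesseract executable, or None if it cannot be found."""
    # Prefer a binary already on PATH (the usual case on a Hugging Face Space).
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to explicit candidate locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=(DEFAULT_WINDOWS_PATH,)):
    """Point pytesseract at the located binary and return the path used (or None)."""
    path = find_tesseract(candidates)
    if path is not None:
        import pytesseract  # imported lazily so find_tesseract works without it

        pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling `configure_pytesseract()` once at the top of the script, before any OCR call, lets the same file run unchanged both locally and on the Space, since the PATH lookup succeeds wherever Tesseract is installed system-wide.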

	## Running Locally

	To run the application locally:
	`python app.py`

	## Deployment

	To deploy on Hugging Face Spaces:

	1. Create a new Space on Hugging Face.
	2. While creating the Space, set the Space SDK to Gradio.
	3. Upload the `app.py` file and create `requirements.txt` and `packages.txt` for Python libraries and system packages, respectively.

	## Usage

	1. Upload an image containing Hindi and English text.
	2. Enter a keyword to search within the extracted text.
	3. The application will display the extracted text and search results.

	Note: OCR accuracy varies with image quality. Hazy or blurred words may be read incorrectly.
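The usage flow above (extract text, then search it) can be sketched roughly as follows. This is an illustration under assumptions, not the actual contents of `app.py`: the function names `extract_text` and `search_keyword` are hypothetical, the OCR step assumes pytesseract with the Hindi and English language data installed (`lang="hin+eng"`), and the search is a simple case-insensitive line match.

```python
import re


def extract_text(image, lang="hin+eng"):
    """Run Tesseract OCR on a PIL image; assumes Hindi and English data are installed."""
    import pytesseract  # imported lazily so search_keyword works without it

    return pytesseract.image_to_string(image, lang=lang)


def search_keyword(text, keyword):
    """Return the lines of `text` that contain `keyword`, ignoring case."""
    keyword = keyword.strip()
    if not keyword:
        return []
    # Escape the keyword so characters like "." or "(" are matched literally.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [line.strip() for line in text.splitlines() if pattern.search(line)]


sample = "Hello World\nनमस्ते दुनिया\nhello again"
print(search_keyword(sample, "hello"))  # ['Hello World', 'hello again']
```

Matching line by line keeps the result readable when the image contains several lines of mixed Hindi and English, and escaping the keyword avoids surprises when a user types punctuation.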