Spaces: gauri-sharan/test-two

gauri-sharan committed on Sep 29, 2024

Commit 96586f0 (verified) · 1 Parent(s): 1247d27

Update README.md

Files changed (1)

README.md +48 -1

@@ -1,7 +1,7 @@
 ---
 title: img-read
 emoji: 📚
-colorFrom: blue
+colorFrom: indigo
 colorTo: purple
 sdk: gradio
 sdk_version: 4.44.0
@@ -44,3 +44,50 @@ This application also takes advantage of **ZeroGPU** to run efficiently on power
 - Required libraries:
   ```bash
   pip install gradio byaldi transformers torch pillow
+## Installation
+1. Clone the repository:
+   ```bash
+   git clone <repository-url>
+   cd <repository-directory>
+   ```
+2. Install the required dependencies with the `pip install` command listed under "Required libraries".
+3. Run the application:
+   ```bash
+   python app.py
+   ```
+### Using the App
+1. **Upload an Image**: Click on the "Upload an Image" button to select and upload an image containing text.
+2. **Extract Text**: Press the "Extract Text" button to process the image and extract any text found.
+3. **Search Keywords**: Enter keywords in the search box and click "Search" to highlight matching keywords in the extracted text.
+## Code Overview
+The core functionality of the application lives in two functions:
+- **OCR and Text Extraction**:
+  - The `ocr_and_extract` function processes the uploaded image, extracts text, and cleans the output to remove unnecessary labels.
+- **Keyword Highlighting**:
+  - The `search_keywords` function takes the extracted text and user-defined keywords, highlighting matches within the text for better visibility.
+## ZeroGPU Integration
+The application is powered by **ZeroGPU**, leveraging the **NVIDIA A100** GPU. This ensures:
+- Faster image processing and text extraction.
+- Seamless handling of large-scale models like Qwen2VL.
+- Optimal performance during high computational loads.
+## Error Handling
+The application includes basic error handling for issues encountered during image processing. Errors are printed to the console, and a user-friendly message is shown in the interface.
+## References
+- [Byaldi](https://huggingface.co/vidore/colpali) for providing the RAGMultiModalModel.
+- [Hugging Face Transformers](https://huggingface.co/docs/transformers/index) for state-of-the-art models.
+- [ZeroGPU](https://www.zerogpu.com) for enabling efficient GPU computation with NVIDIA A100.
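
The README text added in this commit describes keyword highlighting (`search_keywords`) and basic error handling, but `app.py` is not part of the diff. The sketch below is only one plausible way to get that behaviour, not the author's implementation. The names `highlight_keywords`, `safe_extract`, and `FRIENDLY_ERROR`, the use of `<mark>` tags, and the whitespace-separated keyword format are all assumptions made for illustration.

```python
import html
import re

# Hypothetical message; the real app's wording is not shown in this commit.
FRIENDLY_ERROR = "Could not process the image. Please try a different file."


def highlight_keywords(text: str, keywords: str) -> str:
    """Return HTML-escaped text with each keyword wrapped in <mark> tags.

    Assumes keywords arrive as one whitespace-separated string.
    Matching is case-insensitive and literal (no regex syntax in keywords).
    """
    # Longest terms first so a longer keyword wins over its own prefix.
    terms = sorted(set(keywords.split()), key=len, reverse=True)
    if not terms:
        return html.escape(text)
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)

    parts, last = [], 0
    for match in pattern.finditer(text):
        # Escape the text between matches, then wrap the match itself.
        parts.append(html.escape(text[last:match.start()]))
        parts.append(f"<mark>{html.escape(match.group(0))}</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


def safe_extract(extract_fn, image):
    """Call an extraction function; log failures and return a friendly message."""
    try:
        return extract_fn(image)
    except Exception as exc:  # broad on purpose: any failure becomes a UI message
        print(f"Error during image processing: {exc}")
        return FRIENDLY_ERROR
```

In a Gradio interface, the string returned by `highlight_keywords` could be shown in a `gr.HTML` component. Escaping the text before inserting the tags keeps characters such as `<` or `&` in the OCR output from being interpreted as markup.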