Update README.md
README.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 title: img-read
 emoji: π
-colorFrom:
+colorFrom: indigo
 colorTo: purple
 sdk: gradio
 sdk_version: 4.44.0
@@ -44,3 +44,50 @@ This application also takes advantage of **ZeroGPU** to run efficiently on power
 - Required libraries:
 ```bash
 pip install gradio byaldi transformers torch pillow
+```
+## Installation
+
+1. Clone the repository:
+```bash
+git clone <repository-url>
+cd <repository-directory>
+```
+2. Install the required dependencies with pip, using the command listed under "Required libraries" above.
+
+3. Run the application:
+```bash
+python app.py
+```
+### Using the App
+
+1. **Upload an Image**: Click the "Upload an Image" button to select and upload an image containing text.
+2. **Extract Text**: Click the "Extract Text" button to process the image and extract any text it contains.
+3. **Search Keywords**: Enter keywords in the search box and click "Search" to highlight matches in the extracted text.
+
+## Code Overview
+
+The core functionality lives in two functions:
+
+- **OCR and Text Extraction**:
+  - The `ocr_and_extract` function processes the uploaded image, extracts its text, and cleans the output by removing unnecessary labels.
+
+- **Keyword Highlighting**:
+  - The `search_keywords` function takes the extracted text and the user's keywords and highlights each match in the text.
+
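The keyword highlighting described for `search_keywords` can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function name `highlight_keywords_sketch`, the whitespace-separated keyword format, case-insensitive matching, and the use of HTML `<mark>` tags are all assumptions.

```python
import html
import re


def highlight_keywords_sketch(extracted_text: str, keywords: str) -> str:
    """Return HTML with every keyword match wrapped in <mark> tags.

    Hypothetical helper: keywords are assumed to be separated by whitespace
    and matched case-insensitively.
    """
    # Longest terms first, so a longer keyword wins over a shorter prefix.
    terms = sorted({t for t in keywords.split() if t}, key=len, reverse=True)
    if not terms:
        return html.escape(extracted_text)

    pattern = re.compile(
        "(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE
    )
    # With a single capturing group, re.split alternates between
    # non-matching text (even indices) and matches (odd indices).
    parts = pattern.split(extracted_text)
    return "".join(
        f"<mark>{html.escape(part)}</mark>" if index % 2 else html.escape(part)
        for index, part in enumerate(parts)
    )
```

Splitting on the raw text and escaping each piece afterwards keeps characters such as `<` and `&` in the OCR output from being interpreted as markup when the result is rendered.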
+## ZeroGPU Integration
+
+The application runs on **ZeroGPU**, which gives it access to an **NVIDIA A100** GPU. This provides:
+- Faster image processing and text extraction.
+- Smooth handling of large models such as Qwen2VL.
+- Better performance under heavy computational load.
+
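On Hugging Face Spaces, a function usually requests a ZeroGPU device through the `spaces.GPU` decorator. The sketch below shows that pattern under stated assumptions: whether `app.py` uses it this way is not shown in the diff, and both the `gpu_task` alias and the `run_model_stub` placeholder are hypothetical names for illustration.

```python
try:
    import spaces  # available inside Hugging Face Spaces

    gpu_task = spaces.GPU
except ImportError:
    def gpu_task(fn):
        """Fallback for local runs without the `spaces` package: no-op."""
        return fn


@gpu_task
def run_model_stub(prompt: str) -> str:
    # Placeholder for GPU-bound work such as model inference.
    # A real function would move the model to the GPU and call it here.
    return prompt.upper()
```

The fallback keeps the same script runnable on a machine without the `spaces` package, where the decorated function simply runs on whatever hardware is available.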
+## Error Handling
+
+The application includes basic error handling for problems that occur during image processing. Errors are printed to the console, and a user-friendly message is shown in the interface.
+
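One way to get the behavior described above is to wrap the extraction call so that the full error goes to the console while the interface receives a short message. This is a sketch under assumptions, not the handler in `app.py`: `safe_extract_sketch`, its `extract_fn` parameter, and the message text are hypothetical.

```python
import traceback

# Hypothetical message shown to the user when processing fails.
FRIENDLY_ERROR = "Sorry, the image could not be processed. Please try another image."


def safe_extract_sketch(image, extract_fn):
    """Call extract_fn(image); on failure, log the error and return a message."""
    try:
        if image is None:
            raise ValueError("no image was uploaded")
        return extract_fn(image)
    except Exception:
        # Full traceback goes to the console (stderr) for debugging.
        traceback.print_exc()
        # The interface only sees a short, non-technical message.
        return FRIENDLY_ERROR
```

Returning a string instead of re-raising means the text output component can display the message directly, at the cost of hiding the error type from the caller.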
+## References
+
+- [Byaldi](https://huggingface.co/vidore/colpali) for providing the `RAGMultiModalModel`.
+- [Hugging Face Transformers](https://huggingface.co/docs/transformers/index) for state-of-the-art models.
+- [ZeroGPU](https://www.zerogpu.com) for enabling efficient GPU computation with NVIDIA A100.
+