Commit 96586f0 by gauri-sharan · verified · 1 Parent(s): 1247d27

Update README.md

  1. README.md +48 -1
README.md CHANGED
@@ -1,7 +1,7 @@
1
  ---
2
  title: img-read
3
  emoji: 📚
4
- colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
  sdk_version: 4.44.0
@@ -44,3 +44,50 @@ This application also takes advantage of **ZeroGPU** to run efficiently on power
44
  - Required libraries:
45
  ```bash
46
  pip install gradio byaldi transformers torch pillow
 
1
  ---
2
  title: img-read
3
  emoji: 📚
4
+ colorFrom: indigo
5
  colorTo: purple
6
  sdk: gradio
7
  sdk_version: 4.44.0
 
44
  - Required libraries:
45
  ```bash
46
  pip install gradio byaldi transformers torch pillow
47
+
48
+ ## Installation
49
+
50
+ 1. Clone the repository:
51
+ ```bash
52
+ git clone <repository-url>
53
+ cd <repository-directory>
54
+
55
+ 2. Install the required dependencies using pip.
56
+
57
+ 3. Run the application:
58
+ ```bash
59
+ python app.py
60
+
61
+ ### Using the App
62
+
63
+ 1. **Upload an Image**: Click the "Upload an Image" button and select an image that contains text.
64
+ 2. **Extract Text**: Press the "Extract Text" button to process the image and extract any text found.
65
+ 3. **Search Keywords**: Enter keywords in the search box and click "Search" to highlight matching keywords in the extracted text.
66
+
67
+ ## Code Overview
68
+
69
+ The core functionality lives in two functions:
70
+
71
+ - **OCR and Text Extraction**:
72
+ - The `ocr_and_extract` function processes the uploaded image, extracts text, and cleans the output to remove unnecessary labels.
73
+
74
+ - **Keyword Highlighting**:
75
+ - The `search_keywords` function takes the extracted text and the user's keywords and highlights each match in the text.
76
+
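A minimal sketch of the two steps described above, for illustration only. The commit does not show the bodies of `ocr_and_extract` or `search_keywords`, so the helper names `clean_output` and `highlight_keywords`, the role-label list, and the `<mark>` tag are all assumptions rather than the app's actual code.

```python
import html
import re

# Hypothetical label set: chat-role markers that a decoded model response may contain.
ROLE_LABELS = ("system", "user", "assistant")


def clean_output(raw_text: str) -> str:
    """Drop empty lines and lines that consist only of a role label."""
    kept = [
        line.strip()
        for line in raw_text.splitlines()
        if line.strip() and line.strip().lower() not in ROLE_LABELS
    ]
    return "\n".join(kept)


def highlight_keywords(text: str, keywords: str) -> str:
    """Wrap case-insensitive keyword matches in <mark> tags.

    `keywords` is a whitespace-separated string, as typed into a search box.
    The text is HTML-escaped piece by piece so it is safe to render as HTML.
    """
    terms = [re.escape(term) for term in keywords.split()]
    if not terms:
        return html.escape(text)
    # Longest terms first so a short keyword does not pre-empt a longer one.
    terms.sort(key=len, reverse=True)
    pattern = re.compile("(" + "|".join(terms) + ")", re.IGNORECASE)
    # With one capture group, re.split alternates: non-match, match, non-match, ...
    parts = pattern.split(text)
    pieces = []
    for index, part in enumerate(parts):
        escaped = html.escape(part)
        pieces.append(f"<mark>{escaped}</mark>" if index % 2 else escaped)
    return "".join(pieces)
```

Splitting on the raw text and escaping each piece afterwards keeps a keyword such as `amp` from matching inside an HTML entity like `&amp;`.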
77
+ ## ZeroGPU Integration
78
+
79
+ The application runs on **ZeroGPU**, which uses **NVIDIA A100** GPUs. This provides:
80
+ - Faster image processing and text extraction.
81
+ - Support for large models such as Qwen2-VL.
82
+ - Optimal performance during high computational loads.
83
+
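On Hugging Face Spaces, ZeroGPU is typically requested per function call with the `spaces.GPU` decorator. The sketch below assumes that pattern. The function name `extract_text` and its placeholder body are hypothetical, and the fallback decorator exists only so the snippet also runs on a machine without the `spaces` package.

```python
try:
    import spaces  # provided on Hugging Face Spaces

    gpu = spaces.GPU
except (ImportError, AttributeError):
    # Local fallback (assumption): a no-op decorator when `spaces` is unavailable.
    def gpu(func):
        return func


@gpu
def extract_text(image_path: str) -> str:
    # Placeholder body (hypothetical): the real app would run its model here.
    return f"text extracted from {image_path}"
```

Only the decorated function needs the GPU, so model inference is the natural place for the decorator, while UI code stays undecorated.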
84
+ ## Error Handling
85
+
86
+ The application includes basic error handling for problems that occur during image processing. Errors are printed to the console, and a user-friendly message is shown in the interface.
87
+
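One way to get the behaviour described above is a small wrapper around the processing call. This is a sketch under assumptions: the name `run_safely` and the wording of the message are hypothetical and do not come from the commit.

```python
import traceback
from typing import Any, Callable


def run_safely(process: Callable[[Any], str], image: Any) -> str:
    """Call `process(image)` and turn any exception into a readable message."""
    try:
        return process(image)
    except Exception as exc:
        # Full traceback goes to the console (stderr) for debugging.
        traceback.print_exc()
        # The interface only shows a short, non-technical message.
        return f"Sorry, the image could not be processed ({type(exc).__name__})."
```

For example, wrapping a function that raises `ValueError` returns the message ending in `(ValueError).` instead of propagating the exception to the UI.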
88
+ ## References
89
+
90
+ - [Byaldi](https://huggingface.co/vidore/colpali) for providing the RAGMultiModalModel.
91
+ - [Hugging Face Transformers](https://huggingface.co/docs/transformers/index) for state-of-the-art models.
92
+ - [ZeroGPU](https://www.zerogpu.com) for enabling efficient GPU computation with NVIDIA A100.
93
+