gauri-sharan committed on
Commit
1247d27
·
verified ·
1 Parent(s): 5a2fd5e

Update README.md

Files changed (1)
  1. README.md +35 -1
README.md CHANGED
@@ -9,4 +9,38 @@ app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
13
+
14
+ # Byaldi + Qwen2VL
15
+
16
+ ## Overview
17
+
18
+ The **Byaldi + Qwen2VL** app extracts text from images using OCR (Optical Character Recognition) and natural language processing. It uses the **RAGMultiModalModel** from Byaldi together with the **Qwen2VL** model to generate responses based on the extracted text.
19
+
20
+ The application runs on **ZeroGPU**, backed by an **NVIDIA A100** GPU, for fast processing even of large and complex image inputs.
21
+
22
+ ## Features
23
+
24
+ - **Image Upload**: Users can upload images from which text will be extracted.
25
+ - **Text Extraction**: Extracts text from the uploaded images using the Byaldi and Qwen2VL models.
26
+ - **Keyword Search**: Allows users to search for specific keywords within the extracted text and highlights them.
27
+ - **High-Performance**: Runs on **ZeroGPU (NVIDIA A100)** for accelerated computation and efficient model execution.
28
+ - **User-Friendly Interface**: Built using Gradio for an interactive user experience.
29
+
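The README does not show how the keyword search highlights matches, and the snippet below is not part of this commit. It is only a minimal sketch of one way to do it, assuming matches are case-insensitive and are marked with Markdown bold. The function name `highlight_keyword` and the sample text are hypothetical.

```python
import re


def highlight_keyword(text: str, keyword: str) -> str:
    """Wrap every case-insensitive match of `keyword` in Markdown bold markers.

    Hypothetical helper: the app's real highlighting logic is not shown
    in the README and may differ.
    """
    if not keyword:
        # Nothing to search for, so return the text unchanged.
        return text
    # re.escape treats the keyword as literal text, not as a regex pattern.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    # Keep the original casing of each match while wrapping it.
    return pattern.sub(lambda m: f"**{m.group(0)}**", text)


if __name__ == "__main__":
    extracted = "Invoice total: 42 USD. Please pay the invoice by Friday."
    print(highlight_keyword(extracted, "invoice"))
```

Escaping the keyword keeps characters such as `.` or `(` from being read as regular-expression syntax, which matters when users search OCR output for punctuation-heavy strings.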
30
+ ## Technologies Used
31
+
32
+ - **Gradio**: For creating the web interface.
33
+ - **Byaldi RAGMultiModalModel**: For indexing and searching images.
34
+ - **Qwen2VL**: For generating responses based on visual and textual inputs.
35
+ - **ZeroGPU**: For efficient model inference using **NVIDIA A100**.
36
+ - **PyTorch**: For deep learning functionalities.
37
+ - **Pillow**: For image handling.
38
+
39
+ ## Getting Started
40
+
41
+ ### Prerequisites
42
+
43
+ - Python 3.8 or later
44
+ - Required libraries:
45
+ ```bash
46
+ pip install gradio byaldi transformers torch pillow