Update README.md
README.md
CHANGED
@@ -9,4 +9,38 @@ app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Byaldi + Qwen2VL

## Overview

The **Byaldi + Qwen2VL** app extracts text from images using OCR (Optical Character Recognition) and natural language processing. It uses the **RAGMultiModalModel** from Byaldi together with the **Qwen2VL** model to generate responses based on the extracted text.

The app runs on **ZeroGPU**, which provides **NVIDIA A100** GPU acceleration, so processing stays fast even for large or complex image inputs.

## Features

- **Image Upload**: Users can upload the images from which text will be extracted.
- **Text Extraction**: Uses the Byaldi and Qwen2VL models to extract text from the uploaded images.
- **Keyword Search**: Searches the extracted text for user-supplied keywords and highlights the matches.
- **High Performance**: Runs on **ZeroGPU (NVIDIA A100)** for accelerated model execution.
- **User-Friendly Interface**: Built with Gradio for an interactive user experience.
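
The keyword search feature could be sketched roughly as follows. This README does not show the app's actual implementation, so the function name `highlight_keywords`, the case-insensitive literal matching, and the choice of HTML `<mark>` tags for highlighting are all assumptions made for illustration.

```python
import html
import re


def highlight_keywords(text, keywords):
    """Return `text` as HTML with every keyword occurrence wrapped in <mark>.

    Hypothetical helper: matching is case-insensitive and literal
    (keywords are not treated as regular expressions).
    """
    # Drop empty keywords; try longer terms first so that
    # "invoice number" wins over "invoice" at the same position.
    terms = sorted({k.strip() for k in keywords if k.strip()}, key=len, reverse=True)
    if not terms:
        return html.escape(text)

    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)

    # Match against the raw text and escape each segment separately,
    # so a keyword can never match inside an HTML entity such as "&amp;".
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

The returned string is plain HTML, so it could be shown in any component that renders HTML, such as Gradio's `gr.HTML`.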

## Technologies Used

- **Gradio**: For creating the web interface.
- **Byaldi RAGMultiModalModel**: For indexing and searching images.
- **Qwen2VL**: For generating responses based on visual and textual inputs.
- **ZeroGPU**: For efficient model inference on **NVIDIA A100** GPUs.
- **PyTorch**: For deep learning functionality.
- **Pillow**: For image handling.
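
A quick way to confirm that the Python libraries in this list are importable is a small check like the one below. It is only a sketch: the helper name `check_environment` is hypothetical, the minimum version is taken from this README's stated requirement of Python 3.8, and ZeroGPU is left out because it is a hosting feature rather than a library from this list.

```python
import importlib.util
import sys

# Import names for the libraries listed above. The Pillow package is
# imported as "PIL"; the other names match their pip package names.
REQUIRED_MODULES = ["gradio", "byaldi", "transformers", "torch", "PIL"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 8)):
    """Return (python_ok, missing).

    python_ok is True when the interpreter is at least `min_python`;
    missing lists the module names that cannot be found.
    """
    python_ok = sys.version_info >= min_python
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) or "none")
```

The check only looks the modules up without importing them, so it runs quickly even when heavy packages such as PyTorch are installed.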

## Getting Started

### Prerequisites

- Python 3.8 or later
- Required libraries:
```bash
pip install gradio byaldi transformers torch pillow