Spaces:
Runtime error
Runtime error
A newer version of the Streamlit SDK is available:
1.44.1
metadata
title: 'DocBot: Smart Document ChatBot'
emoji: π€
colorFrom: indigo
colorTo: purple
sdk: streamlit
sdk_version: 0.87.0
app_file: app.py
pinned: false
π€ DocBot: Smart Document ChatBot
DocBot is an intelligent document processing application with a chatbot interface. It can process various types of documents, including PDFs and images, extract essential information, and enable user interaction through a chat interface.
βοΈ Features
- Document Upload: Upload PDF, PNG, JPG, or JPEG files for processing.
- Text Extraction: Extract text content from uploaded documents.
- Image Processing: Convert PDF documents to images and extract text from images.
- Chatbot Interface: Interact with the document through a chatbot interface powered by Groq.
- Natural Language Understanding: Utilizes spaCy for natural language processing.
- Dynamic Progress Bar: Visual feedback on document processing progress.
- Error Handling: Provides error messages for any processing failures.
βοΈ Installation
Clone the repository:
git clone https://github.com/yourusername/docbot.git
Install the required Python packages:
pip install -r requirements.txt
Set up the environment variables:
Create a
.env
file in the root directory and add the following:GROQ_API_KEY='your_groq_api_key'
Run the Streamlit app:
streamlit run app.py
π Usage
- Run the Streamlit app using the provided installation instructions.
- Upload your document using the file uploader.
- Wait for the document to be processed.
- Interact with the document by asking questions in the chatbot interface.
π» Technologies Used
- Streamlit - For building the interactive web application.
- PyPDF2 - For PDF document processing.
- pdf2image - For converting PDFs to images.
- PyMuPDF - For PDF document rendering.
- Tesseract OCR - For extracting text from images.
- spaCy - For natural language processing.
- Groq - For AI-powered chatbot interaction.
- Pillow - For image processing.