DocBot / README.md
aadi8anant's picture
Update README.md
4adab6f verified

A newer version of the Streamlit SDK is available: 1.44.1

Upgrade
metadata
title: 'DocBot: Smart Document ChatBot'
emoji: πŸ€–
colorFrom: indigo
colorTo: purple
sdk: streamlit
sdk_version: 0.87.0
app_file: app.py
pinned: false

πŸ€– DocBot: Smart Document ChatBot

DocBot is an intelligent document processing application with a chatbot interface. It can process various types of documents, including PDFs and images, extract essential information, and enable user interaction through a chat interface.

⭐️ Features

  • Document Upload: Upload PDF, PNG, JPG, or JPEG files for processing.
  • Text Extraction: Extract text content from uploaded documents.
  • Image Processing: Convert PDF documents to images and extract text from images.
  • Chatbot Interface: Interact with the document through a chatbot interface powered by Groq.
  • Natural Language Understanding: Utilizes spaCy for natural language processing.
  • Dynamic Progress Bar: Visual feedback on document processing progress.
  • Error Handling: Provides error messages for any processing failures.

βš™οΈ Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/docbot.git
    
  2. Install the required Python packages:

    pip install -r requirements.txt
    
  3. Set up the environment variables:

    Create a .env file in the root directory and add the following:

    GROQ_API_KEY='your_groq_api_key'
    
  4. Run the Streamlit app:

    streamlit run app.py
    

πŸš€ Usage

  1. Run the Streamlit app using the provided installation instructions.
  2. Upload your document using the file uploader.
  3. Wait for the document to be processed.
  4. Interact with the document by asking questions in the chatbot interface.

πŸ’» Technologies Used

  • Streamlit - For building the interactive web application.
  • PyPDF2 - For PDF document processing.
  • pdf2image - For converting PDFs to images.
  • PyMuPDF - For PDF document rendering.
  • Tesseract OCR - For extracting text from images.
  • spaCy - For natural language processing.
  • Groq - For AI-powered chatbot interaction.
  • Pillow - For image processing.