Spaces:
Sleeping
Sleeping
title: Deepseek RAG Chat Bot | |
emoji: π | |
colorFrom: red | |
colorTo: pink | |
sdk: streamlit | |
sdk_version: 1.41.1 | |
app_file: app.py | |
pinned: false | |
license: apache-2.0 | |
short_description: Deepseek-RAG-Chat-Bot | |
# RAG-Powered Chatbot with Streamlit | |
This project is a Retrieval-Augmented Generation (RAG) chatbot built using Streamlit. It allows users to upload a PDF document, process it, and ask questions about its content. The application efficiently processes the document once and uses vector-based retrieval to answer queries. | |
--- | |
## Features | |
- Upload PDF documents and process them into chunks for efficient querying. | |
- Generate semantic embeddings using `sentence-transformers`. | |
- Store embeddings in a `FAISS` vector database for efficient retrieval. | |
- Use the `DeepSeek` API for question-answering capabilities. | |
- Built with Streamlit for an interactive and user-friendly UI. | |
--- | |
## Requirements | |
- Python 3.8 or higher | |
### Dependencies | |
Install the required Python libraries: | |
```plaintext | |
streamlit==1.25.0 | |
langchain==0.81.0 | |
langchain-community==0.1.2 | |
faiss-cpu==1.7.4 | |
sentence-transformers==2.2.2 | |
pypdf==3.8.1 | |
``` | |
To install all dependencies: | |
```bash | |
pip install -r requirements.txt | |
``` | |
--- | |
## Setup and Usage | |
### 1. Clone the Repository | |
```bash | |
git clone https://github.com/your-username/rag-chatbot.git | |
cd rag-chatbot | |
``` | |
### 2. Install Dependencies | |
```bash | |
pip install -r requirements.txt | |
``` | |
### 3. Run the Application | |
Run the Streamlit application: | |
```bash | |
streamlit run app.py | |
``` | |
### 4. Interact with the Chatbot | |
1. Enter your `DeepSeek API Key` in the provided input field. | |
2. Upload a PDF document. | |
3. Ask questions about the content of the document. | |
--- | |
## Project Structure | |
```plaintext | |
. | |
βββ app.py # Main application code | |
βββ requirements.txt # List of dependencies | |
βββ README.md # Documentation | |
``` | |
--- | |
## Key Technologies Used | |
1. **Streamlit**: | |
- For building a user-friendly web interface. | |
2. **LangChain**: | |
- For document loading, text splitting, and RAG pipeline. | |
3. **FAISS**: | |
- For storing and querying vector embeddings. | |
4. **Sentence Transformers**: | |
- For generating semantic embeddings of text chunks. | |
5. **PyPDF**: | |
- For parsing PDF files. | |
6. **DeepSeek API**: | |
- For question-answering capabilities. | |
--- | |
## How It Works | |
1. **PDF Upload**: | |
- The user uploads a PDF document. | |
- The document is split into manageable text chunks. | |
2. **Embeddings Generation**: | |
- Semantic embeddings are generated using `sentence-transformers`. | |
3. **Vector Storage**: | |
- The embeddings are stored in a `FAISS` vector database for efficient retrieval. | |
4. **Question Answering**: | |
- The user asks a question about the uploaded document. | |
- The RAG pipeline retrieves relevant chunks and generates a response using the `DeepSeek` API. | |
--- | |
## Troubleshooting | |
- **Error: `pypdf package not found`** | |
Ensure `pypdf` is installed. Run: | |
```bash | |
pip install pypdf | |
``` | |
- **Error: `langchain-community module not found`** | |
Ensure `langchain-community` is installed. Run: | |
```bash | |
pip install langchain-community | |
``` | |
- **Reprocessing PDF on Every Query** | |
This issue is resolved by using `st.session_state` to persist the processed `vector_store`. | |
--- | |
## Future Improvements | |
1. Add support for multiple file uploads. | |
2. Integrate additional language models. | |
3. Enhance the UI with better visualization of document content. | |
4. Add support for other document formats (e.g., Word, TXT). | |
--- | |
## License | |
This project is licensed under the MIT License. See the `LICENSE` file for more details. | |
--- | |
## Contributions | |
Contributions are welcome! Feel free to fork the repository and submit a pull request. | |
--- | |
## Contact | |
For any queries or support, please contact: | |
- Name: [Sagun Chalise] | |
- Email: [[email protected]] | |
--- | |
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |