Spaces:

chalisesagun
/

deepseek-chat

Sleeping

App Files Files Community

deepseek-chat / README.md

chalisesagun

Update README.md

f8c5d54 verified 3 months ago

preview code

raw

history blame contribute delete

3.96 kB

	---
	title: Deepseek RAG Chat Bot
	emoji: 📈
	colorFrom: red
	colorTo: pink
	sdk: streamlit
	sdk_version: 1.41.1
	app_file: app.py
	pinned: false
	license: apache-2.0
	short_description: Deepseek-RAG-Chat-Bot
	---

	# RAG-Powered Chatbot with Streamlit

	This project is a Retrieval-Augmented Generation (RAG) chatbot built using Streamlit. It allows users to upload a PDF document, process it, and ask questions about its content. The application efficiently processes the document once and uses vector-based retrieval to answer queries.

	---

	## Features

	- Upload PDF documents and process them into chunks for efficient querying.
	- Generate semantic embeddings using `sentence-transformers`.
	- Store embeddings in a `FAISS` vector database for efficient retrieval.
	- Use the `DeepSeek` API for question-answering capabilities.
	- Built with Streamlit for an interactive and user-friendly UI.

	---

	## Requirements

	- Python 3.8 or higher

	### Dependencies

	Install the required Python libraries:

	```plaintext
	streamlit==1.25.0
	langchain==0.81.0
	langchain-community==0.1.2
	faiss-cpu==1.7.4
	sentence-transformers==2.2.2
	pypdf==3.8.1
	```

	To install all dependencies:

	```bash
	pip install -r requirements.txt
	```

	---

	## Setup and Usage

	### 1. Clone the Repository

	```bash
	git clone https://github.com/your-username/rag-chatbot.git
	cd rag-chatbot
	```

	### 2. Install Dependencies

	```bash
	pip install -r requirements.txt
	```

	### 3. Run the Application

	Run the Streamlit application:

	```bash
	streamlit run app.py
	```

	### 4. Interact with the Chatbot

	1. Enter your `DeepSeek API Key` in the provided input field.
	2. Upload a PDF document.
	3. Ask questions about the content of the document.

	---

	## Project Structure

	```plaintext
	.
	├── app.py # Main application code
	├── requirements.txt # List of dependencies
	├── README.md # Documentation
	```

	---

	## Key Technologies Used

	1. Streamlit:
	- For building a user-friendly web interface.

	2. LangChain:
	- For document loading, text splitting, and RAG pipeline.

	3. FAISS:
	- For storing and querying vector embeddings.

	4. Sentence Transformers:
	- For generating semantic embeddings of text chunks.

	5. PyPDF:
	- For parsing PDF files.

	6. DeepSeek API:
	- For question-answering capabilities.

	---

	## How It Works

	1. PDF Upload:
	- The user uploads a PDF document.
	- The document is split into manageable text chunks.

	2. Embeddings Generation:
	- Semantic embeddings are generated using `sentence-transformers`.

	3. Vector Storage:
	- The embeddings are stored in a `FAISS` vector database for efficient retrieval.

	4. Question Answering:
	- The user asks a question about the uploaded document.
	- The RAG pipeline retrieves relevant chunks and generates a response using the `DeepSeek` API.

	---

	## Troubleshooting

	- Error: `pypdf package not found`
	Ensure `pypdf` is installed. Run:
	```bash
	pip install pypdf
	```

	- Error: `langchain-community module not found`
	Ensure `langchain-community` is installed. Run:
	```bash
	pip install langchain-community
	```

	- Reprocessing PDF on Every Query
	This issue is resolved by using `st.session_state` to persist the processed `vector_store`.

	---

	## Future Improvements

	1. Add support for multiple file uploads.
	2. Integrate additional language models.
	3. Enhance the UI with better visualization of document content.
	4. Add support for other document formats (e.g., Word, TXT).

	---

	## License

	This project is licensed under the MIT License. See the `LICENSE` file for more details.

	---

	## Contributions

	Contributions are welcome! Feel free to fork the repository and submit a pull request.

	---

	## Contact

	For any queries or support, please contact:

	- Name: [Sagun Chalise]
	- Email: [[email protected]]


	---


	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference