Komal01 committed on
Commit ccfb9e1 · verified · 1 Parent(s): 617971b

Upload 8 files

Files changed (8)
  1. .Dockerignore +2 -0
  2. .opik.config +5 -0
  3. Dockerfile +30 -0
  4. README.md +127 -11
  5. app.py +240 -0
  6. requirements.txt +0 -0
  7. start.sh +39 -0
  8. streamlit_app.py +76 -0
.Dockerignore ADDED
@@ -0,0 +1,2 @@
+ .env
+ __pycache__/
.opik.config ADDED
@@ -0,0 +1,5 @@
+ [opik]
+ url_override = https://www.comet.com/opik/api/
+ workspace = komalgupta991000-gmail-com
+ api_key = BX9OYn3NZBKuztCxL4XvMOeeI
+
Dockerfile ADDED
@@ -0,0 +1,30 @@
+ FROM python:3.11.4-slim-buster
+
+ # Install curl and Ollama
+ RUN apt-get update && apt-get install -y curl && \
+     curl -fsSL https://ollama.ai/install.sh | sh && \
+     apt-get clean && rm -rf /var/lib/apt/lists/*
+
+ # Set up a non-root user and environment
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV HOME=/home/user \
+     PATH="/home/user/.local/bin:$PATH"
+
+ WORKDIR $HOME/app
+
+ # Install Python dependencies first to take advantage of layer caching
+ COPY --chown=user requirements.txt .
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ # Copy the application code
+ COPY --chown=user . .
+
+ # Make the start script executable
+ RUN chmod +x start.sh
+
+ # Expose FastAPI & Streamlit ports
+ EXPOSE 7860 8501
+
+ CMD ["./start.sh"]
README.md CHANGED
@@ -1,11 +1,127 @@
- ---
- title: Streaming RAG Chatbot
- emoji: 👀
- colorFrom: blue
- colorTo: yellow
- sdk: docker
- pinned: false
- license: apache-2.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # AI Assistant API
+
+ ## 🚀 Overview
+
+ This project is an AI-powered assistant that uses FastAPI and FAISS for retrieval-augmented generation (RAG). It answers user queries from a vector database of indexed documents and evaluates responses with Opik.
+
+ ## 🛠️ Features
+
+ - Upload and manage datasets
+ - Query the AI assistant with domain-specific constraints
+ - Use FAISS for efficient document retrieval
+ - Evaluate LLM responses using Opik
+
+ ## 📽️ Demo Video
+
+ [🎥 Click here to watch the demo](https://drive.google.com/file/d/10h4VnTm_y5SBczI6NnoTuqRxyq55HAn5/view?usp=sharing)
+
+ ## 📦 Installation
+
+ ### Install Ollama
+
+ Ollama is required for this project. Install it as follows:
+
+ ```bash
+ # For macOS
+ brew install ollama
+
+ # For Linux
+ curl -fsSL https://ollama.ai/install.sh | sh
+
+ # For Windows, download the installer from https://ollama.com/
+
+ # Verify the installation
+ ollama --version
+ ```
+
+ ### Clone and Set Up the Project
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/Komal-99/cyfuture_bot.git
+
+ # Navigate to the project directory
+ cd cyfuture_bot
+
+ # Install the Python dependencies
+ pip install -r requirements.txt
+ ```
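+
+ ### Configure environment variables
+
+ `app.py` loads Opik credentials from environment variables via `python-dotenv`. A minimal `.env` sketch (the variable names `OPIK_API_KEY` and `workspace` are the ones read in `app.py`; replace the placeholder values with your own Comet/Opik credentials):
+
+ ```bash
+ # .env (placeholders — substitute your own values)
+ OPIK_API_KEY=your_opik_api_key
+ workspace=your_opik_workspace
+ ```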
+
+ ## 🚀 Usage
+
+ ### Start the Project
+
+ Run the `start.sh` script to set up and launch the application:
+
+ ```bash
+ chmod +x start.sh
+ ./start.sh
+ ```
+
+ This script:
+
+ - Sets environment variables for optimization
+ - Starts Ollama in the background and waits for it to initialize
+ - Pulls the required models (deepseek-r1:7b, nomic-embed-text) if they are not already present
+ - Launches the FastAPI server on http://127.0.0.1:7860 and the Streamlit UI on http://127.0.0.1:8501
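+
+ ### Run with Docker (optional)
+
+ The included `Dockerfile` bundles Ollama, the FastAPI server, and the Streamlit UI. A minimal build-and-run sketch (the `ai-assistant` image tag is arbitrary; the models are pulled inside the container on first start, so the first run can take a while):
+
+ ```bash
+ # Build the image (tag name is arbitrary)
+ docker build -t ai-assistant .
+
+ # Run it, publishing the FastAPI (7860) and Streamlit (8501) ports
+ docker run -p 7860:7860 -p 8501:8501 ai-assistant
+ ```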
+
+ ## API Endpoints
+
+ ### Upload Dataset
+
+ ```
+ POST /upload_dataset/    # Upload an Excel dataset to be used for evaluation.
+ ```
+
+ ### Run Evaluation
+
+ ```
+ POST /run_evaluation/    # Evaluate the model's performance using Opik.
+ ```
+
+ ### Query AI Assistant
+
+ ```
+ GET /query/?input_text=your_question    # Ask the assistant a question. The model retrieves relevant documents and streams an answer based on them.
+ ```
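+
+ With the stack running locally, you can exercise these endpoints with `curl`, for example (the endpoint paths and the `file` / `input_text` parameter names come from `app.py`; the dataset path is just an example):
+
+ ```bash
+ # Upload an Excel dataset for evaluation
+ curl -X POST -F "file=@dataset.xlsx" http://127.0.0.1:7860/upload_dataset/
+
+ # Run the Opik evaluation
+ curl -X POST http://127.0.0.1:7860/run_evaluation/
+
+ # Ask a question (the answer is streamed back as plain text)
+ curl -G "http://127.0.0.1:7860/query/" --data-urlencode "input_text=How do I register for a new connection?"
+ ```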
+
+ ## 📂 Folder Structure
+
+ ```
+ .
+ ├── AI_Agent/            # Data source documents
+ ├── deepseek_cyfuture/   # DeepSeek FAISS vector DB
+ ├── .env                 # Environment variables
+ ├── .gitignore           # Files to ignore in Git
+ ├── dataset.xlsx         # Sample dataset file
+ ├── Dockerfile           # Docker configuration
+ ├── requirements.txt     # Python dependencies
+ ├── start.sh             # Startup script
+ ├── app.py               # Main application file (FastAPI)
+ ├── streamlit_app.py     # Streamlit UI
+ └── README.md            # Project documentation
+ ```
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please follow these steps:
+
+ 1. Fork the repository
+ 2. Create a new branch (`git checkout -b feature-branch`)
+ 3. Commit your changes (`git commit -m 'Add new feature'`)
+ 4. Push to the branch (`git push origin feature-branch`)
+ 5. Create a pull request
+
+ ## 📜 License
+
+ This project is licensed under the MIT License - see the LICENSE file for details.
+
+ ## 📬 Contact
+
+ For questions or issues, reach out:
+
+ - GitHub: https://github.com/Komal-99
app.py ADDED
@@ -0,0 +1,240 @@
+ import os
+ import re
+ import pandas as pd
+ import backoff
+ import asyncio
+ from datetime import datetime
+ from dotenv import load_dotenv
+ from langchain_ollama import OllamaEmbeddings, ChatOllama
+ from langchain_community.vectorstores import FAISS
+
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.runnables import RunnablePassthrough
+ from opik import Opik, track, evaluate
+ from opik.evaluation.metrics import Hallucination, AnswerRelevance
+ import litellm
+ import opik
+ from fastapi.responses import StreamingResponse
+ from litellm.integrations.opik.opik import OpikLogger
+ from litellm import completion, APIConnectionError
+ from fastapi import FastAPI, UploadFile, File, HTTPException, Query, Response
+
+ from langchain.document_loaders import PyMuPDFLoader, UnstructuredWordDocumentLoader
+ from langchain.text_splitter import RecursiveCharacterTextSplitter
+
+ app = FastAPI()
+
+ def initialize_opik():
+     opik_logger = OpikLogger()
+     litellm.callbacks = [opik_logger]
+     opik.configure(api_key=os.getenv("OPIK_API_KEY"), workspace=os.getenv("workspace"), force=True)
+
+
+ # Initialize Opik and load environment variables
+ load_dotenv()
+ initialize_opik()
+
+ # Initialize Opik Client
+ dataset = Opik().get_or_create_dataset(
+     name="Cyfuture_faq",
+     description="Dataset on IGL FAQ",
+ )
+ @app.post("/upload_dataset/")
+ def upload_dataset(file: UploadFile = File(...)):
+     try:
+         df = pd.read_excel(file.file)
+         dataset.insert(df.to_dict(orient='records'))
+         return {"message": "Dataset uploaded successfully"}
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+ # Manual helper: load dataset.xlsx directly into the Opik dataset for the evaluation task
+ def upload_dataset_from_file():
+     df = pd.read_excel("dataset.xlsx")
+     dataset.insert(df.to_dict(orient='records'))
+     return "Dataset uploaded successfully"
+
+ # Initialize the LLM model
+ model = ChatOllama(model="deepseek-r1:7b", base_url="http://localhost:11434", temperature=0.2, max_tokens=200)
+
+ def load_documents(folder_path):
+     text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
+     all_documents = []
+     os.makedirs('data', exist_ok=True)
+
+     for filename in os.listdir(folder_path):
+         file_path = os.path.join(folder_path, filename)
+
+         if filename.endswith('.pdf'):
+             loader = PyMuPDFLoader(file_path)
+         elif filename.endswith('.docx'):
+             loader = UnstructuredWordDocumentLoader(file_path)
+         else:
+             continue  # Skip unsupported files
+
+         documents = loader.load()
+         all_documents.extend(text_splitter.split_documents(documents))
+         print(f"Processed and indexed {filename}")
+
+     return all_documents
+
+ # Vector Store Setup
+ def setup_vector_store(documents):
+     embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+     vectorstore = FAISS.from_documents(documents, embeddings)
+     vectorstore.save_local("deepseek_cyfuture")
+     return vectorstore
+
+
+ # Create RAG Chain
+ def create_rag_chain(retriever):
+     prompt_template = ChatPromptTemplate.from_template(
+         """
+         You are an AI question-answering assistant specialized in answering user queries strictly from the provided context. Give a detailed answer to the user's question, considering the context.
+
+         STRICT RULES:
+         - You *must not* answer any questions outside the provided context.
+         - If the question is unrelated to billing, payments, customer, or meter reading, respond with exactly:
+           **"This question is outside my specialized domain."**
+         - Do NOT attempt to generate an answer from loosely related context.
+         - If the context does not contain a valid answer, simply state: **"I don't know the answer."**
+
+         VALIDATION STEP:
+         1. Check if the query is related to **billing, payments, customer, or meter reading**.
+         2. If NOT, respond with: `"This question is outside my specialized domain."` and nothing else.
+         3. If the context does not contain directly relevant data, try to find the best possible answer from the context.
+         4. Do NOT generate speculative answers.
+         5. If the generated answer does not address the question, look for the best possible answer in the context; you may add more relevant context to the answer.
+
+         Question: {question}
+         Context: {context}
+         Answer:
+         """
+     )
+     return (
+         {"context": retriever | format_docs, "question": RunnablePassthrough()}
+         | prompt_template
+         | model
+         | StrOutputParser()
+     )
+
+ def format_docs(docs):
+     return "\n\n".join(doc.page_content for doc in docs)
+
+ def clean_response(response):
+     return re.sub(r'<think>.*?</think>', '', response, flags=re.DOTALL).strip()
+
+
+ @track()
+ def llm_chain(input_text):
+     try:
+         context = "\n".join(doc.page_content for doc in retriever.invoke(input_text))
+         response = "".join(chunk for chunk in rag_chain.stream(input_text) if isinstance(chunk, str))
+         return {"response": clean_response(response), "context_used": context}
+     except Exception as e:
+         return {"error": str(e)}
+
+ def evaluation_task(x):
+     try:
+         result = llm_chain(x['user_question'])
+         return {"input": x['user_question'], "output": result["response"], "context": result["context_used"], "expected": x['expected_output']}
+     except Exception as e:
+         return {"input": x['user_question'], "output": "", "context": "", "expected": x['expected_output']}
+
+ # experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+ # metrics = [Hallucination(model=model1), AnswerRelevance(model=model1)]
+
+
+ @app.post("/run_evaluation/")
+ @backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+ def run_evaluation():
+     experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+     metrics = [Hallucination(), AnswerRelevance()]
+     try:
+         evaluate(
+             experiment_name=experiment_name,
+             dataset=dataset,
+             task=evaluation_task,
+             scoring_metrics=metrics,
+             experiment_config={"model": model},
+             task_threads=2
+         )
+         return {"message": "Evaluation completed successfully"}
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+
+ # @backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+ # def run_evaluation():
+ #     return evaluate(experiment_name=experiment_name, dataset=dataset, task=evaluation_task, scoring_metrics=metrics, experiment_config={"model": model}, task_threads=2)
+
+ # run_evaluation()
+
+ # Create Vector Database
+ def create_db():
+     source = r'AI Agent'
+     markdown_content = load_documents(source)
+     setup_vector_store(markdown_content)
+     return "Database created successfully"
+
+ # Load the persisted vector store and build the RAG chain
+ embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+ vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+ retriever = vectorstore.as_retriever(search_kwargs={'k': 2})
+ rag_chain = create_rag_chain(retriever)
+
+ @app.get("/query/")
+ @track()
+ def chain(input_text: str = Query(..., description="Enter your question")):
+     try:
+         def generate():
+             # Stream raw chunks (including the model's <think> block);
+             # the client is responsible for splitting on "</think>".
+             for chunk in rag_chain.stream(input_text):
+                 if isinstance(chunk, str):
+                     yield chunk
+
+         return StreamingResponse(generate(), media_type="text/plain")
+
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+ @app.get("/")
+ def read_root():
+     return {"message": "Welcome to the AI Assistant API!"}
+
+ if __name__ == "__main__":
+     # Start the FastAPI app directly (start.sh runs it via uvicorn on 0.0.0.0 instead)
+     import uvicorn
+     uvicorn.run(app, host="127.0.0.1", port=7860)
+
+
+ # questions = ["Is the website accessible through mobile also? please tell the benefits of it", "How do I register for a new connection?", "how to make payments?"]
+ # # Questions for retrieval
+ # # Answer questions
+ # create_db()
+ # # Load Vector Store
+ # embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+ # vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+ # retriever = vectorstore.as_retriever(search_kwargs={'k': 3})
+ # rag_chain = create_rag_chain(retriever)
+
+ # for question in questions:
+ #     print(f"Question: {question}")
+ #     for chunk in rag_chain.stream(question):
+ #         print(chunk, end="", flush=True)
+ #     print("\n" + "-" * 50 + "\n")
requirements.txt ADDED
Binary file (8.09 kB).
 
start.sh ADDED
@@ -0,0 +1,39 @@
+ #!/bin/bash
+
+ # Set environment variables for optimization
+ export OMP_NUM_THREADS=4
+ export MKL_NUM_THREADS=4
+ export CUDA_VISIBLE_DEVICES=0,1
+
+ # Start Ollama in the background
+ ollama serve &
+
+ # Wait for Ollama to start up before pulling models
+ max_attempts=30
+ attempt=0
+ while ! curl -s http://localhost:11434/api/tags >/dev/null; do
+     sleep 1
+     attempt=$((attempt + 1))
+     if [ $attempt -eq $max_attempts ]; then
+         echo "Ollama failed to start within 30 seconds. Exiting."
+         exit 1
+     fi
+ done
+
+ echo "Ollama is ready."
+
+ # Pull the models if not already present
+ if ! ollama list | grep -q "deepseek-r1:7b"; then
+     ollama pull deepseek-r1:7b
+ fi
+ if ! ollama list | grep -q "nomic-embed-text"; then
+     ollama pull nomic-embed-text
+ fi
+
+ # Print the API URL
+ echo "API is running on: http://0.0.0.0:7860"
+
+ # Start FastAPI in the background
+ uvicorn app:app --host 0.0.0.0 --port 7860 --workers 4 --limit-concurrency 20 &
+
+ # Start Streamlit for the UI
+ streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0
streamlit_app.py ADDED
@@ -0,0 +1,76 @@
+
+ import streamlit as st
+ import requests
+ import re  # For whitespace cleanup
+
+ st.set_page_config(page_title="AI Chatbot", layout="centered")
+ st.title("🤖 AI Chatbot")
+
+ if "messages" not in st.session_state:
+     st.session_state.messages = []
+
+ # Query the AI API and stream the response
+ def query_ai(question):
+     url = "http://127.0.0.1:7860/query/"
+     params = {"input_text": question}
+
+     with requests.get(url, params=params, stream=True) as response:
+         if response.status_code == 200:
+             full_response = ""
+             for chunk in response.iter_content(chunk_size=1024):
+                 if chunk:
+                     text_chunk = chunk.decode("utf-8")
+                     full_response += text_chunk
+                     yield full_response  # Yield the accumulated response so far
+         else:
+             yield f"Error: API returned status code {response.status_code}"
+
+ # Custom CSS for spacing fix
+ st.markdown("""
+     <style>
+     .chat-box {
+         background-color: #1e1e1e;
+         padding: 12px;
+         border-radius: 10px;
+         margin-top: 5px;
+         font-size: 15px;
+         font-family: monospace;
+         white-space: pre-wrap;
+         word-wrap: break-word;
+         line-height: 1.2;
+         color: #ffffff;
+     }
+     </style>
+ """, unsafe_allow_html=True)
+
+ user_input = st.text_input("Ask a question:", "", key="user_input")
+ submit_button = st.button("Submit")
+
+ if submit_button and user_input:
+     st.session_state.messages.append({"role": "user", "content": user_input})
+
+     # Placeholder for streaming
+     response_container = st.empty()
+     full_response = ""
+
+     with st.spinner("🤖 AI is thinking..."):
+         for chunk in query_ai(user_input):
+             full_response = chunk
+             response_container.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+     response_container.empty()  # Hide the streamed "thinking" output after completion
+
+     # Extract the refined answer after "</think>"
+     if "</think>" in full_response:
+         refined_response = full_response.split("</think>", 1)[-1].strip()
+     else:
+         refined_response = full_response  # Fallback if </think> is missing
+
+     # Remove extra newlines and excessive spaces
+     refined_response = re.sub(r'\n\s*\n', '\n', refined_response.strip())
+
+     # Expandable AI thought-process box
+     with st.expander("🤖 AI's Thought Process (Click to Expand)"):
+         st.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+     # Display the refined answer with clean formatting
+     st.write("Answer:")
+     st.markdown(refined_response, unsafe_allow_html=True)