Upload 8 files
- .Dockerignore +2 -0
- .opik.config +5 -0
- Dockerfile +30 -0
- README.md +127 -11
- app.py +240 -0
- requirements.txt +0 -0
- start.sh +39 -0
- streamlit_app.py +76 -0
.Dockerignore
ADDED
@@ -0,0 +1,2 @@
+.env
+__pycache__/
.opik.config
ADDED
@@ -0,0 +1,5 @@
+[opik]
+url_override = https://www.comet.com/opik/api/
+workspace = komalgupta991000-gmail-com
+api_key = BX9OYn3NZBKuztCxL4XvMOeeI
+
Dockerfile
ADDED
@@ -0,0 +1,30 @@
+FROM python:3.11.4-slim-buster
+
+
+
+# Install curl and Ollama
+RUN apt-get update && apt-get install -y curl && \
+    curl -fsSL https://ollama.ai/install.sh | sh && \
+    apt-get clean && rm -rf /var/lib/apt/lists/*
+
+# Set up user and environment
+RUN useradd -m -u 1000 user
+USER user
+ENV HOME=/home/user \
+    PATH="/home/user/.local/bin:$PATH"
+
+WORKDIR $HOME/app
+
+COPY --chown=user requirements.txt .
+RUN pip install --no-cache-dir --upgrade -r requirements.txt
+COPY . .
+
+
+COPY --chown=user . .
+
+# Make the start script executable
+RUN chmod +x start.sh
+# Expose FastAPI & Streamlit ports
+EXPOSE 7860 8501
+
+CMD ["./start.sh"]
README.md
CHANGED
@@ -1,11 +1,127 @@
+# AI Assistant API
+
+## 🚀 Overview
+
+This project is an AI-powered assistant that uses FastAPI and FAISS for retrieval-augmented generation (RAG). It answers user queries from a vector database and evaluates responses with Opik.
+
+## 🛠️ Features
+
+- Upload and manage datasets
+- Query the AI assistant with domain-specific constraints
+- Use FAISS for efficient document retrieval
+- Evaluate LLM responses using Opik
+
+## 📽️ Demo Video
+
+[🎥 Click here to watch the demo](https://drive.google.com/file/d/10h4VnTm_y5SBczI6NnoTuqRxyq55HAn5/view?usp=sharing)
+
+## 📦 Installation
+
+### Install Ollama
+
+Ollama is required for this project. Follow these steps to install it:
+
+```bash
+# For macOS
+brew install ollama
+
+# For Linux
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Verify installation
+ollama --version
+
+# For Windows, download the installer from https://ollama.com/
+```
+
+### Clone and Set Up the Project
+
+Clone the repository:
+```
+git clone https://github.com/Komal-99/cyfuture_bot.git
+```
+
+Navigate to the project directory:
+```
+cd cyfuture_bot
+```
+
+Install dependencies:
+```
+pip install -r requirements.txt
+```
+
+## 🚀 Usage
+
+### Start the Project
+
+Run the `start.sh` script to set up and launch the application:
+```
+chmod +x start.sh
+./start.sh
+```
+
+This script:
+
+- Sets environment variables for optimization
+- Starts Ollama in the background
+- Pulls the required models (deepseek-r1:7b, nomic-embed-text)
+- Waits for Ollama to initialize
+- Launches the FastAPI server on http://127.0.0.1:7860
+- Launches the Streamlit app on http://127.0.0.1:8501
+
+## API Endpoints
+
+### Upload Dataset
+```
+POST /upload_dataset/   # Upload an Excel dataset to be used for evaluation.
+```
+
+### Run Evaluation
+```
+POST /run_evaluation/   # Evaluate the model's performance using Opik.
+```
+
+### Query AI Assistant
+```
+GET /query/?input_text=your_question   # Ask the assistant a question. The model retrieves relevant information and generates an answer from the indexed documents.
+```
+
+## 📂 Folder Structure
+
+```
+.
+├── AI_Agent/            # Data source
+├── deepseek_cyfuture/   # DeepSeek vector DB
+├── .env                 # Environment variables
+├── .gitignore           # Files to ignore in Git
+├── dataset.xlsx         # Sample dataset file
+├── Dockerfile           # Docker configuration
+├── requirements.txt     # Python dependencies
+├── start.sh             # Startup script
+├── app.py               # Main application file
+├── README.md            # Project documentation
+```
+
+## 🤝 Contributing
+
+Contributions are welcome! Please follow these steps:
+
+1. Fork the repository
+2. Create a new branch (`git checkout -b feature-branch`)
+3. Commit your changes (`git commit -m 'Add new feature'`)
+4. Push to the branch (`git push origin feature-branch`)
+5. Create a pull request
+
+## 📜 License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
+
+## 📬 Contact
+
+For questions or issues, reach out:
+
+GitHub: https://github.com/Komal-99
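For reference, here is a minimal Python sketch of how a client could call the endpoints documented above, assuming the stack started by `start.sh` is reachable on http://127.0.0.1:7860. The file name and question are placeholders, not part of the commit.

```python
# Minimal client sketch for the API endpoints described in the README.
# Assumes the FastAPI server is running locally on port 7860; the file
# name and question below are illustrative placeholders.
import requests

BASE = "http://127.0.0.1:7860"

# Upload an Excel dataset for evaluation
with open("dataset.xlsx", "rb") as f:
    resp = requests.post(f"{BASE}/upload_dataset/", files={"file": f})
    print(resp.json())

# Trigger an Opik evaluation run
print(requests.post(f"{BASE}/run_evaluation/").json())

# Ask a question and stream the plain-text answer
with requests.get(f"{BASE}/query/", params={"input_text": "How can I make payments?"}, stream=True) as resp:
    for chunk in resp.iter_content(chunk_size=1024):
        if chunk:
            print(chunk.decode("utf-8"), end="", flush=True)
```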
app.py
ADDED
@@ -0,0 +1,240 @@
+import os
+import re
+import pandas as pd
+import backoff
+import asyncio
+from datetime import datetime
+from dotenv import load_dotenv
+from langchain_ollama import OllamaEmbeddings, ChatOllama
+from langchain_community.vectorstores import FAISS
+
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.runnables import RunnablePassthrough
+from opik import Opik, track, evaluate
+from opik.evaluation.metrics import Hallucination, AnswerRelevance
+import litellm
+import opik
+from fastapi.responses import StreamingResponse
+from litellm.integrations.opik.opik import OpikLogger
+from litellm import completion, APIConnectionError
+from fastapi import FastAPI, UploadFile, File, HTTPException, Query, Response
+
+from langchain.document_loaders import PyMuPDFLoader, UnstructuredWordDocumentLoader
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+
+app = FastAPI()
+
+def initialize_opik():
+    opik_logger = OpikLogger()
+    litellm.callbacks = [opik_logger]
+    opik.configure(api_key=os.getenv("OPIK_API_KEY"), workspace=os.getenv("workspace"), force=True)
+
+
+# Initialize Opik and load environment variables
+load_dotenv()
+initialize_opik()
+
+# Initialize Opik client
+dataset = Opik().get_or_create_dataset(
+    name="Cyfuture_faq",
+    description="Dataset on IGL FAQ",
+)
+
+@app.post("/upload_dataset/")
+def upload_dataset(file: UploadFile = File(...)):
+    try:
+        df = pd.read_excel(file.file)
+        dataset.insert(df.to_dict(orient='records'))
+        return {"message": "Dataset uploaded successfully"}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+# To load the local dataset into the evaluation task manually
+def upload_dataset_local():
+    df = pd.read_excel("dataset.xlsx")
+    dataset.insert(df.to_dict(orient='records'))
+    return "Dataset uploaded successfully"
+
+# Initialize LLM model
+model = ChatOllama(model="deepseek-r1:7b", base_url="http://localhost:11434", temperature=0.2, max_tokens=200)
+
+def load_documents(folder_path):
+    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
+    all_documents = []
+    os.makedirs('data', exist_ok=True)
+
+    for filename in os.listdir(folder_path):
+        file_path = os.path.join(folder_path, filename)
+
+        if filename.endswith('.pdf'):
+            loader = PyMuPDFLoader(file_path)
+        elif filename.endswith('.docx'):
+            loader = UnstructuredWordDocumentLoader(file_path)
+        else:
+            continue  # Skip unsupported files
+
+        documents = loader.load()
+        all_documents.extend(text_splitter.split_documents(documents))
+        print(f"Processed and indexed {filename}")
+
+    return all_documents
+
+# Vector store setup
+def setup_vector_store(documents):
+    embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+    vectorstore = FAISS.from_documents(documents, embeddings)
+    vectorstore.save_local("deepseek_cyfuture")
+    return vectorstore
+
+
+# Create RAG chain
+def create_rag_chain(retriever):
+    prompt_template = ChatPromptTemplate.from_template(
+        """
+        You are an AI question answering assistant specialized in answering user queries strictly from the provided context. Give a detailed answer to the user's question considering the context.
+
+        STRICT RULES:
+        - You *must not* answer any questions outside the provided context.
+        - If the question is unrelated to billing, payments, customer, or meter reading, respond with exactly:
+          **"This question is outside my specialized domain."**
+        - Do NOT attempt to generate an answer from loosely related context.
+        - If the context does not contain a valid answer, simply state: **"I don't know the answer."**
+
+        VALIDATION STEP:
+        1. Check if the query is related to **billing, payments, customer, or meter reading**.
+        2. If NOT, respond with: `"This question is outside my specialized domain."` and nothing else.
+        3. If the context does not contain directly relevant data, try to find the best possible answer from the context.
+        4. Do NOT generate speculative answers.
+        5. If the generated answer does not address the question, find the best possible answer from the context; you may add more relevant context to the answer.
+
+        Question: {question}
+        Context: {context}
+        Answer:
+        """
+    )
+    return (
+        {"context": retriever | format_docs, "question": RunnablePassthrough()}
+        | prompt_template
+        | model
+        | StrOutputParser()
+    )
+
+def format_docs(docs):
+    return "\n\n".join(doc.page_content for doc in docs)
+
+def clean_response(response):
+    return re.sub(r'<think>.*?</think>', '', response, flags=re.DOTALL).strip()
+
+
+@track()
+def llm_chain(input_text):
+    try:
+        context = "\n".join(doc.page_content for doc in retriever.invoke(input_text))
+        response = "".join(chunk for chunk in rag_chain.stream(input_text) if isinstance(chunk, str))
+        return {"response": clean_response(response), "context_used": context}
+    except Exception as e:
+        return {"error": str(e)}
+
+def evaluation_task(x):
+    try:
+        result = llm_chain(x['user_question'])
+        return {"input": x['user_question'], "output": result["response"], "context": result["context_used"], "expected": x['expected_output']}
+    except Exception as e:
+        return {"input": x['user_question'], "output": "", "context": x['expected_output']}
+
+# experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+# metrics = [Hallucination(model=model1), AnswerRelevance(model=model1)]
+
+
+@app.post("/run_evaluation/")
+@backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+def run_evaluation():
+    experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+    metrics = [Hallucination(), AnswerRelevance()]
+    try:
+        evaluate(
+            experiment_name=experiment_name,
+            dataset=dataset,
+            task=evaluation_task,
+            scoring_metrics=metrics,
+            experiment_config={"model": model},
+            task_threads=2
+        )
+        return {"message": "Evaluation completed successfully"}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+# @backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+# def run_evaluation():
+#     return evaluate(experiment_name=experiment_name, dataset=dataset, task=evaluation_task, scoring_metrics=metrics, experiment_config={"model": model}, task_threads=2)
+
+# run_evaluation()
+
+# Create vector database
+def create_db():
+    source = r'AI Agent'
+    markdown_content = load_documents(source)
+    setup_vector_store(markdown_content)
+    return "Database created successfully"
+
+embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+retriever = vectorstore.as_retriever(search_kwargs={'k': 2})
+rag_chain = create_rag_chain(retriever)
+
+@track()
+@app.get("/query/")
+def chain(input_text: str = Query(..., description="Enter your question")):
+    try:
+        # def generate():
+        #     for chunk in rag_chain.stream(input_text):
+        #         if isinstance(chunk, str):
+        #             yield chunk
+        def generate():
+            buffer = ""  # Temporary buffer to hold chunks until `</think>` is found
+            start_sending = False
+
+            for chunk in rag_chain.stream(input_text):
+                # if isinstance(chunk, str):
+                #     buffer += chunk  # Append chunk to buffer
+                #
+                #     # Check if `</think>` is found
+                #     if "</think>" in buffer:
+                #         start_sending = True
+                #         # Yield everything after `</think>` (including `</think>` itself)
+                #         yield buffer.split("</think>", 1)[1]
+                #         buffer = ""  # Clear the buffer after sending the first response
+                #     elif start_sending:
+                yield chunk  # Stream every chunk; the `</think>` filtering above is currently disabled
+        return StreamingResponse(generate(), media_type="text/plain")
+
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+@app.get("/")
+def read_root():
+    return {"message": "Welcome to the AI Assistant API!"}
+
+if __name__ == "__main__":
+    # Start the FastAPI app
+    import uvicorn
+    uvicorn.run(app, host="127.0.0.1", port=7860)
+
+
+# questions = ["Is the website accessible through mobile also? please tell the benefits of it", "How do I register for a new connection?", "how to make payments?"]
+# # Questions for retrieval
+# # Answer questions
+# create_db()
+# # Load vector store
+# embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+# vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+# retriever = vectorstore.as_retriever(search_kwargs={'k': 3})
+# rag_chain = create_rag_chain(retriever)
+
+# for question in questions:
+#     print(f"Question: {question}")
+#     for chunk in rag_chain.stream(question):
+#         print(chunk, end="", flush=True)
+#     print("\n" + "-" * 50 + "\n")
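`evaluation_task()` above reads the `user_question` and `expected_output` fields from each dataset record, so any Excel file sent to `/upload_dataset/` needs those columns. Below is a small sketch of how such a `dataset.xlsx` could be produced; the rows are illustrative placeholders.

```python
# Sketch of a dataset.xlsx layout compatible with evaluation_task() in app.py,
# which expects 'user_question' and 'expected_output' columns.
# The example rows are placeholders, not real FAQ content.
import pandas as pd

rows = [
    {"user_question": "How do I register for a new connection?",
     "expected_output": "Registration steps as documented in the FAQ."},
    {"user_question": "How can I pay my bill online?",
     "expected_output": "Online payment options as documented in the FAQ."},
]
pd.DataFrame(rows).to_excel("dataset.xlsx", index=False)
```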
requirements.txt
ADDED
Binary file (8.09 kB).
start.sh
ADDED
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+# Set environment variables for optimization
+export OMP_NUM_THREADS=4
+export MKL_NUM_THREADS=4
+export CUDA_VISIBLE_DEVICES=0,1
+
+# Start Ollama in the background
+ollama serve &
+
+# Pull the model if not already present
+if ! ollama list | grep -q "deepseek-r1:7b"; then
+    ollama pull deepseek-r1:7b
+fi
+if ! ollama list | grep -q "nomic-embed-text"; then
+    ollama pull nomic-embed-text
+fi
+# Wait for Ollama to start up
+max_attempts=30
+attempt=0
+while ! curl -s http://localhost:11434/api/tags >/dev/null; do
+    sleep 1
+    attempt=$((attempt + 1))
+    if [ $attempt -eq $max_attempts ]; then
+        echo "Ollama failed to start within 30 seconds. Exiting."
+        exit 1
+    fi
+done
+
+echo "Ollama is ready."
+
+# Print the API URL
+echo "API is running on: http://0.0.0.0:7860"
+
+# Start FastAPI in the background
+uvicorn app:app --host 0.0.0.0 --port 7860 --workers 4 --limit-concurrency 20 &
+
+# Start Streamlit for UI
+streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0
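The readiness loop in `start.sh` polls Ollama's `/api/tags` endpoint with curl before the API and UI come up. The same check could be done from Python, for example before rebuilding the FAISS index; this is a sketch that reuses the script's host, port, and 30-second budget.

```python
# Sketch of the readiness check start.sh performs with curl, done from Python:
# poll Ollama's /api/tags endpoint until it responds or the attempt budget runs out.
import time
import requests

def wait_for_ollama(url="http://localhost:11434/api/tags", max_attempts=30):
    for _ in range(max_attempts):
        try:
            if requests.get(url, timeout=1).ok:
                return True
        except requests.RequestException:
            pass  # Server not up yet; retry after a short pause
        time.sleep(1)
    return False

if not wait_for_ollama():
    raise SystemExit("Ollama failed to start within 30 seconds.")
print("Ollama is ready.")
```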
streamlit_app.py
ADDED
@@ -0,0 +1,76 @@
+
+import streamlit as st
+import requests
+import re  # For whitespace cleanup
+
+st.set_page_config(page_title="AI Chatbot", layout="centered")
+st.title("🤖 AI Chatbot")
+
+if "messages" not in st.session_state:
+    st.session_state.messages = []
+
+# Query the AI API and stream the response
+def query_ai(question):
+    url = "http://127.0.0.1:7860/query/"
+    params = {"input_text": question}
+
+    with requests.get(url, params=params, stream=True) as response:
+        if response.status_code == 200:
+            full_response = ""
+            for chunk in response.iter_content(chunk_size=1024):
+                if chunk:
+                    text_chunk = chunk.decode("utf-8")
+                    full_response += text_chunk
+                    yield full_response  # Streamed response
+
+# Custom CSS for spacing fix
+st.markdown("""
+    <style>
+        .chat-box {
+            background-color: #1e1e1e;
+            padding: 12px;
+            border-radius: 10px;
+            margin-top: 5px;
+            font-size: 15px;
+            font-family: monospace;
+            white-space: pre-wrap;
+            word-wrap: break-word;
+            line-height: 1.2;
+            color: #ffffff;
+        }
+    </style>
+""", unsafe_allow_html=True)
+
+user_input = st.text_input("Ask a question:", "", key="user_input")
+submit_button = st.button("Submit")
+
+if submit_button and user_input:
+    st.session_state.messages.append({"role": "user", "content": user_input})
+
+    # Placeholder for streaming
+    response_container = st.empty()
+    full_response = ""
+
+    with st.spinner("🤖 AI is thinking..."):
+        for chunk in query_ai(user_input):
+            full_response = chunk
+            response_container.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+    response_container.empty()  # Hide the streamed "thinking" response after completion
+
+    # Extract the refined answer after "</think>"
+    if "</think>" in full_response:
+        refined_response = full_response.split("</think>", 1)[-1].strip()
+    else:
+        refined_response = full_response  # Fallback if </think> is missing
+
+    # Remove extra newlines and excessive spaces
+    refined_response = re.sub(r'\n\s*\n', '\n', refined_response.strip())
+
+    # Expandable AI thought-process box
+    with st.expander("🤖 AI's Thought Process (Click to Expand)"):
+        st.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+    # Display the refined answer with clean formatting
+    st.write("Answer:")
+    st.markdown(refined_response, unsafe_allow_html=True)