datafreak committed on
Commit eee9fe9 · verified · 1 Parent(s): b8fa953

Dockerfile and other imp files

Files changed (8):
  1. Dockerfile +17 -0
  2. api_docs.md +168 -0
  3. main.py +143 -0
  4. requirements.txt +0 -0
  5. retrieval.py +54 -0
  6. templates.py +83 -0
  7. test.py +21 -0
  8. tools.py +29 -0
Dockerfile ADDED
@@ -0,0 +1,17 @@
```dockerfile
# Use the Python 3.10 base image
FROM python:3.10

# Set the working directory in the container
WORKDIR /app

# Copy the requirements file
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application files
COPY . .

# Command to run the FastAPI app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
```
api_docs.md ADDED
@@ -0,0 +1,168 @@
---

# API Documentation for Legal Assistance

## Overview
This API provides legal assistance by allowing users to upload PDF documents and submit queries for various legal services, including legal advisory, report generation, and case outcome prediction.

## Base URL
```
https://navilaw-ai.onrender.com/legal-assistance/
```

## HTTP Method
`POST`

## Endpoint
`/legal-assistance/`

## Request Parameters

### Form Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | `string` | Yes | The legal query the user wishes to ask. |
| `option` | `string` | Yes | The type of legal assistance required. Possible values are "Legal Advisory", "Legal Report Generation", and "Case Outcome Prediction". |
| `files` | `List[UploadFile]` | Yes | A list of PDF files containing legal documents to be analyzed. |

### Example Request
```http
POST /legal-assistance/ HTTP/1.1
Host: navilaw-ai.onrender.com
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="query"

What are the possible outcomes of my case?
--boundary
Content-Disposition: form-data; name="option"

Case Outcome Prediction
--boundary
Content-Disposition: form-data; name="files"; filename="legal_case.pdf"
Content-Type: application/pdf

<PDF file content>
--boundary--
```
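The same multipart request can be assembled from Python. The sketch below uses the `requests` library with a placeholder byte string standing in for a real PDF; it prepares the request without sending it so the multipart structure can be inspected:

```python
import requests

url = "https://navilaw-ai.onrender.com/legal-assistance/"
data = {
    "query": "What are the possible outcomes of my case?",
    "option": "Case Outcome Prediction",
}
# Placeholder bytes stand in for a real PDF,
# e.g. open("legal_case.pdf", "rb").read()
files = [("files", ("legal_case.pdf", b"%PDF-1.4 placeholder", "application/pdf"))]

# Prepare the request without sending it, to inspect the multipart structure;
# requests.post(url, data=data, files=files) would actually submit it.
prepared = requests.Request("POST", url, data=data, files=files).prepare()
print(prepared.method, prepared.url)
print(prepared.headers["Content-Type"])
```

`requests` chooses the boundary itself and sets the `Content-Type` header accordingly, which is why the example above leaves it out of `data`.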
## Response Format

### Successful Response
- **Status Code:** `200 OK`
- **Content-Type:** `application/json`

#### Response Body
- **For Legal Advisory:**
  ```json
  {
      "result": "Based on the provided documents and the legal query, here are the considerations to keep in mind regarding your case..."
  }
  ```

- **For Legal Report Generation:**
  ```json
  {
      "report": "Legal Report:\n\n1. Introduction\n2. Case Details\n3. Analysis\n4. Conclusion"
  }
  ```

- **For Case Outcome Prediction:**
  ```json
  {
      "prediction": "Based on the analysis of the legal precedents and the court's decision in a similar case, there is a 70% chance of a favorable outcome for Fast Retail in their lawsuit against Tech Solutions. The court ruled in favor of Fast Retail, ordering Tech Solutions to pay damages of ₹60,00,000. The uncertainty lies in the court's finding that not all claimed damages were directly attributable to Tech Solutions' breach. It's crucial to consider this when predicting the outcome."
  }
  ```

### Error Response
- **Status Code:** `400 Bad Request`
- **Content-Type:** `application/json`

#### Response Body
```json
{
    "detail": "Please upload at least one PDF file."
}
```

#### Possible Error Messages
- **If no files are uploaded:**
  ```json
  {
      "detail": "Please upload at least one PDF file."
  }
  ```

- **If no query is provided:**
  ```json
  {
      "detail": "Please enter a query."
  }
  ```

- **If an invalid option is selected:**
  ```json
  {
      "detail": "Invalid option selected."
  }
  ```
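A client can branch on these status codes. The helper below is an illustrative sketch (`parse_response` is not part of the API) that surfaces the `detail` message of a 400 response as a Python exception:

```python
def parse_response(status_code: int, body: dict) -> dict:
    """Return the JSON body on success; raise with the API's detail on error."""
    if status_code == 400:
        # The API places its human-readable message under "detail"
        raise ValueError(body.get("detail", "Bad request"))
    if status_code != 200:
        raise RuntimeError(f"Unexpected status code: {status_code}")
    return body

result = parse_response(200, {"result": "Based on the provided documents..."})
```

A real client would feed it `response.status_code` and `response.json()` from the HTTP call.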
## Sample Inputs and Outputs

### 1. Legal Advisory
#### Request
```http
POST /legal-assistance/
```
With the following form data:
- **query:** "What are the implications of the new law on my case?"
- **option:** "Legal Advisory"
- **files:** (Upload PDF: `law_document.pdf`)

#### Response
```json
{
    "result": "The new law may affect your case in several ways, particularly regarding..."
}
```

### 2. Legal Report Generation
#### Request
```http
POST /legal-assistance/
```
With the following form data:
- **query:** "Generate a report on the recent legal changes."
- **option:** "Legal Report Generation"
- **files:** (Upload PDF: `legal_changes.pdf`)

#### Response
```json
{
    "report": "Legal Report:\n\n1. Introduction\n2. Summary of Changes\n3. Implications\n4. Conclusion"
}
```

### 3. Case Outcome Prediction
#### Request
```http
POST /legal-assistance/
```
With the following form data:
- **query:** "What is the likelihood of winning my case based on previous rulings?"
- **option:** "Case Outcome Prediction"
- **files:** (Upload PDF: `previous_rulings.pdf`)

#### Response
```json
{
    "prediction": "Based on the analysis of the legal precedents and the court's decision in a similar case, there is a 70% chance of a favorable outcome for Fast Retail in their lawsuit against Tech Solutions..."
}
```

---
main.py ADDED
@@ -0,0 +1,143 @@
```python
import os
from typing import List

from dotenv import load_dotenv
from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.tools.retriever import create_retriever_tool
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_groq import ChatGroq
from langgraph.prebuilt import create_react_agent
from PyPDF2 import PdfReader

from retrieval import create_retriever
from templates import advisor_template, predictor_template, generator_template
from tools import tavily_tool

load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")
chat = ChatGroq(model="llama-3.3-70b-versatile", api_key=groq_api_key)

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/")
async def read_root():
    return {"message": "Welcome to the Legal Research API! Please use one of the endpoints for requests."}

def process_files(files: List[UploadFile]):
    """Extract text from the uploaded PDFs and split it into overlapping chunks."""
    if not files:
        raise HTTPException(status_code=400, detail="Please upload at least one PDF file.")
    docs = []
    for uploaded_file in files:
        reader = PdfReader(uploaded_file.file)
        text = ""
        for page in reader.pages:
            # extract_text() may return None for image-only pages
            text += page.extract_text() or ""
        docs.append(Document(page_content=text, metadata={"source": uploaded_file.filename}))
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    pdf_content = text_splitter.split_documents(docs)
    return pdf_content

def setup_retriever(pdf_content):
    retriever = create_retriever(pdf_content)
    retrieval_tool = create_retriever_tool(
        retriever,
        "Pdf_content_retriever",
        "Searches and returns excerpts from the set of PDF docs.",
    )
    return retriever, retrieval_tool

def setup_agents(tools):
    advisor_graph = create_react_agent(chat, tools=tools, state_modifier=advisor_template)
    predictor_graph = create_react_agent(chat, tools=tools, state_modifier=predictor_template)
    return advisor_graph, predictor_graph

@app.post("/legal-assistance/")
async def legal_assistance(
    query: str = Form(...),
    option: str = Form(...),
    files: List[UploadFile] = File(...)
):
    if not query:
        raise HTTPException(status_code=400, detail="Please enter a query.")
    pdf_content = process_files(files)
    retriever, retrieval_tool = setup_retriever(pdf_content)
    tools = [tavily_tool, retrieval_tool]
    advisor_graph, predictor_graph = setup_agents(tools)
    inputs = {"messages": [("human", query)]}
    if option == "Legal Advisory":
        async for chunk in advisor_graph.astream(inputs, stream_mode="values"):
            final_result = chunk
        result = final_result["messages"][-1].content
        return {"result": result}
    elif option == "Legal Report Generation":
        set_ret = RunnableParallel({"context": retriever, "query": RunnablePassthrough()})
        rag_chain = set_ret | generator_template | chat | StrOutputParser()
        report = rag_chain.invoke(query)
        return {"report": report}
    elif option == "Case Outcome Prediction":
        async for chunk in predictor_graph.astream(inputs, stream_mode="values"):
            final_prediction = chunk
        # Return the message text, not the raw message object
        prediction = final_prediction["messages"][-1].content
        return {"prediction": prediction}
    else:
        raise HTTPException(status_code=400, detail="Invalid option selected.")

@app.post("/legal-advisory/")
async def legal_advisory_endpoint(
    query: str = Form(...),
    files: List[UploadFile] = File(...)
):
    if not query:
        raise HTTPException(status_code=400, detail="Please enter a query.")
    pdf_content = process_files(files)
    retriever, retrieval_tool = setup_retriever(pdf_content)
    tools = [tavily_tool, retrieval_tool]
    advisor_graph, _ = setup_agents(tools)
    inputs = {"messages": [("human", query)]}
    async for chunk in advisor_graph.astream(inputs, stream_mode="values"):
        final_result = chunk
    result = final_result["messages"][-1].content
    return {"result": result}

@app.post("/case-outcome-prediction/")
async def case_outcome_prediction_endpoint(
    query: str = Form(...),
    files: List[UploadFile] = File(...)
):
    if not query:
        raise HTTPException(status_code=400, detail="Please enter a query.")
    pdf_content = process_files(files)
    retriever, retrieval_tool = setup_retriever(pdf_content)
    tools = [tavily_tool, retrieval_tool]
    _, predictor_graph = setup_agents(tools)
    inputs = {"messages": [("human", query)]}
    async for chunk in predictor_graph.astream(inputs, stream_mode="values"):
        final_prediction = chunk
    prediction = final_prediction["messages"][-1].content
    return {"prediction": prediction}

@app.post("/report-generator/")
async def report_generator_endpoint(
    query: str = Form(...),
    files: List[UploadFile] = File(...)
):
    if not query:
        raise HTTPException(status_code=400, detail="Please enter a query.")
    pdf_content = process_files(files)
    retriever, _ = setup_retriever(pdf_content)
    set_ret = RunnableParallel({"context": retriever, "query": RunnablePassthrough()})
    rag_chain = set_ret | generator_template | chat | StrOutputParser()
    report = rag_chain.invoke(query)
    return {"report": report}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=10000)
```
requirements.txt ADDED
Binary file (2.54 kB)
retrieval.py ADDED
@@ -0,0 +1,54 @@
```python
# RAG helpers: chunk PDFs and build an in-memory retriever
import os

from dotenv import load_dotenv
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from PyPDF2 import PdfReader

load_dotenv()
hf_token = os.getenv("HF_TOKEN")

def load_and_chunk_pdfs(directory_path):
    """Read every PDF in a directory and split the text into overlapping chunks."""
    docs = []

    for filename in os.listdir(directory_path):
        if filename.endswith(".pdf"):
            file_path = os.path.join(directory_path, filename)

            reader = PdfReader(file_path)
            text = ""
            for page in reader.pages:
                # extract_text() may return None for image-only pages
                text += page.extract_text() or ""

            doc = Document(page_content=text, metadata={"source": filename})
            docs.append(doc)

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

    chunked_docs = text_splitter.split_documents(docs)
    return chunked_docs

def create_retriever(documents: list):
    """
    Create and return a retriever backed by Hugging Face embeddings and an
    in-memory vector store.

    Args:
        documents (list): The documents to be embedded and added to the vector store.

    Returns:
        retriever: A retriever object to query the vector store.
    """
    embeddings = HuggingFaceInferenceAPIEmbeddings(
        api_key=hf_token,
        model_name="sentence-transformers/all-MiniLM-L6-v2",
    )

    vectorstore = InMemoryVectorStore(embedding=embeddings)
    vectorstore.add_documents(documents)

    return vectorstore.as_retriever()
```
templates.py ADDED
@@ -0,0 +1,83 @@
```python
from langchain.prompts import PromptTemplate
from pydantic import BaseModel, Field

advisor_template = """You are a legal research assistant tasked with providing
legal advice based on the given vectorstore context. If needed, conduct
additional research using the Tavily Search tool. Analyze the query for
specific legal issues, reference relevant sections of legal documents, and
ensure jurisdictional relevance. Consider conflicting interpretations or
unclear areas of law, and provide practical recommendations or next steps.
Include a disclaimer regarding the limitations of AI-generated legal advice."""

predictor_template = """
You are a legal research assistant tasked with predicting the outcome of a
legal case using the provided vectorstore context. If needed, conduct
additional research using the Tavily Search tool. Analyze relevant legal
precedents, evidence, and arguments, and reference supporting sections from
legal documents. Provide a prediction of the case outcome with confidence
intervals (e.g., a 70 percent chance of a favorable outcome), considering
jurisdictional differences. Highlight any uncertainties that could impact the
result, and include a disclaimer about the limitations of AI-generated
predictions in real-world legal decisions.
"""

example_generator_template = """
---
### Legal Report Template
**Task Overview:**
Generate a concise legal report based on the provided vectorstore according to the
context and query:
{context}

query: {query}
**Report Structure:**
1. **Title:**
   - Clear and descriptive.
2. **Introduction:**
   - State the legal issue addressed.
3. **Legal Precedents:**
   - Summarize relevant precedents that apply.
4. **Key Findings:**
   - Present significant evidence and findings.
5. **Analysis:**
   - Discuss implications and potential outcomes.
6. **Conclusion:**
   - Summarize main points and recommendations.
7. **Disclaimer:**
   - Acknowledge that the report is AI-generated and may not account for all legal factors.
---
"""

generator_template = PromptTemplate.from_template(template=example_generator_template)

class LegalReportResponse(BaseModel):
    """Respond to the user with this"""
    return_direct: bool = False
    case_summary: str = Field(description="A concise summary of the legal case")
    relevant_precedents: str = Field(description="Key legal precedents or statutes relevant to the case")
    evidence_analysis: str = Field(description="Summary of evidence and arguments presented by both sides")
    key_findings: str = Field(description="Important findings or factors that influence the case")
    conclusion: str = Field(description="A brief conclusion based on the analysis")

class CaseOutcomePredictionResponse(BaseModel):
    """Respond to the user with this"""
    return_direct: bool = False
    outcome_prediction: str = Field(description="Predicted outcome of the case")
    confidence_interval: str = Field(description="Confidence interval for the prediction (e.g., 70% chance for the plaintiff)")
    jurisdiction: str = Field(description="The legal jurisdiction relevant to the prediction")
    uncertainty_factors: str = Field(description="Factors that might lead to different outcomes")
    disclaimer: str = Field(description="AI-generated prediction disclaimer for limitations")

class LegalAdviceResponse(BaseModel):
    """Respond to the user with this"""
    return_direct: bool = False
    legal_issue: str = Field(description="The specific legal issue or query addressed")
    advice: str = Field(description="The legal advice or recommendation provided based on the given context")
    relevant_sections: str = Field(description="Relevant sections from legal documents or case law supporting the advice")
    jurisdiction: str = Field(description="The jurisdiction applicable to the legal advice")
    conflicting_interpretations: str = Field(description="Any conflicting interpretations or unclear areas of law")
    next_steps: str = Field(description="Practical recommendations or next steps for the user to take")
    disclaimer: str = Field(description="AI-generated legal advice disclaimer for limitations")
```
test.py ADDED
@@ -0,0 +1,21 @@
```python
import requests

url = "http://127.0.0.1:8000/legal-assistance/"

# List of files to upload (as a list of tuples)
files = [
    ('files', ('Sample Complaint.pdf', open('input/Sample Complaint.pdf', 'rb'), 'application/pdf')),
    ('files', ('Sample Contract.pdf', open('input/Sample Contract.pdf', 'rb'), 'application/pdf')),
]

# The form data (query and option)
data = {
    'query': 'What are the possible legal options for fast retail pvt ltd.?',
    'option': 'Legal Advisory',
}

# Make the request
response = requests.post(url, files=files, data=data)

# Check the response
print(response.json())
```
tools.py ADDED
@@ -0,0 +1,29 @@
```python
import os

from dotenv import load_dotenv
from langchain_community.tools import TavilySearchResults

load_dotenv()
tavily_api_key = os.getenv("TAVILY_API_KEY")

tavily_tool = TavilySearchResults(
    max_results=5,
    search_depth="advanced",
    include_answer=True,
    include_raw_content=True,
    include_images=False,
    include_domains=[
        "indiankanoon.org",      # Indian case law
        "barandbench.com",       # Legal news and updates
        "legallyindia.com",      # Legal developments in India
        "scconline.com",         # Supreme Court Cases Online
        "lawtimesjournal.in",    # Legal news and case analysis
        "lawyersclubindia.com",  # Legal community discussions
        "vlex.in",               # Global legal information with an Indian focus
        "taxmann.com",           # Taxation and corporate law in India
    ],
    exclude_domains=["globalsearch.com", "genericnews.com"],
)
```