Update agent.py
updating the prompt and adding new tools for searching
agent.py
CHANGED
@@ -11,17 +11,16 @@ initial_prompt = "How can I help you today?"

 prompt = """
 [
-  {"role": "system", "content": "You are
-  {"role": "user", "content": "
   If the search results are irrelevant to the question respond with *** I do not have enough information to answer this question.***
   Search results may include tables in a markdown format. When answering a question using a table be careful about which rows and columns contain the answer and include all relevant information from the relevant rows and columns that the query is asking about.
-  Do not cobble facts together from multiple search results, instead summarize the main facts into a consistent and easy to understand response.
   Do not base your response on information or knowledge that is not in the search results.
   Make sure your response is answering the query asked. If the query is related to an entity (such as a person or place), make sure you use search results related to that entity.
-  For queries where only a short answer is required, you can give a brief response.
   Consider that each search result is a partial segment from a bigger text, and may be incomplete.
   Your output should always be in a single language - the $vectaraLangName language. Check spelling and grammar for the $vectaraLangName language.
-  Search results for the query *** $vectaraQuery***, are listed below, some are text, some MAY be tables in
   #foreach ($qResult in $vectaraQueryResultsDeduped)
     [$esc.java($foreach.index + 1)]
     #if($qResult.hasTable())
@@ -31,8 +30,9 @@ prompt = """
       $qResult.getText()
     #end
   #end
-  Generate a coherent response (but no more than $vectaraOutChars characters) to the query *** $vectaraQuery ***
-
   Only cite relevant search results in your answer following these specific instructions: $vectaraCitationInstructions
   If the search results are irrelevant to the query, respond with ***I do not have enough information to answer this question.***. Respond always in the $vectaraLangName language, and only in that language."}
 ]
@@ -42,11 +42,11 @@ def create_assistant_tools(cfg):


     class QueryPublicationsArgs(BaseModel):
-        query: str = Field(..., description="The user query, always in the form of a question",

     vec_factory = VectaraToolFactory(vectara_api_key=cfg.api_key,
-
-                                     vectara_corpus_id=cfg.corpus_id)
     summarizer = 'vectara-summary-table-md-query-ext-jan-2025-gpt-4o'
     ask_publications = vec_factory.create_rag_tool(
         tool_name = "ask_publications",
@@ -54,28 +54,77 @@ def create_assistant_tools(cfg):
         Responds to a user question about a particular result, based on the publications.
         """,
         tool_args_schema = QueryPublicationsArgs,
-        reranker = "multilingual_reranker_v1", rerank_k = 100,
-
-
         vectara_summarizer = summarizer,
         include_citations = True,
-        vectara_prompt_text=prompt
     )

     tools_factory = ToolsFactory()
     return (
         tools_factory.standard_tools() +
-        [ask_publications]
     )

 def initialize_agent(_cfg, agent_progress_callback=None):
     menarini_bot_instructions = """
-    - You are
-    -
-    -
-
-    -
-
     """

     agent = Agent(
@@ -11,17 +11,16 @@ initial_prompt = "How can I help you today?"

 prompt = """
 [
+  {"role": "system", "content": "You are an AI assistant that forms a coherent answer to a user query based on search results that are provided to you." },
+  {"role": "user", "content": "
+  [INSTRUCTIONS]
   If the search results are irrelevant to the question respond with *** I do not have enough information to answer this question.***
   Search results may include tables in a markdown format. When answering a question using a table be careful about which rows and columns contain the answer and include all relevant information from the relevant rows and columns that the query is asking about.
   Do not base your response on information or knowledge that is not in the search results.
   Make sure your response is answering the query asked. If the query is related to an entity (such as a person or place), make sure you use search results related to that entity.
   Consider that each search result is a partial segment from a bigger text, and may be incomplete.
   Your output should always be in a single language - the $vectaraLangName language. Check spelling and grammar for the $vectaraLangName language.
+  Search results for the query *** $vectaraQuery***, are listed below, some are text, some MAY be tables in markdown format.
   #foreach ($qResult in $vectaraQueryResultsDeduped)
     [$esc.java($foreach.index + 1)]
     #if($qResult.hasTable())
@@ -31,8 +30,9 @@ prompt = """
       $qResult.getText()
     #end
   #end
+  Generate a coherent response (but no more than $vectaraOutChars characters) to the query *** $vectaraQuery *** using information and facts in the search results provided.
+  Give a slight preference to search results that appear earlier in the list.
+  Include statistical and numerical evidence to support and contextualize your response.
   Only cite relevant search results in your answer following these specific instructions: $vectaraCitationInstructions
   If the search results are irrelevant to the query, respond with ***I do not have enough information to answer this question.***. Respond always in the $vectaraLangName language, and only in that language."}
 ]
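Reviewer note: the `#foreach` loop in the prompt numbers each search result from 1, because Velocity's `$foreach.index` is zero-based and the template adds 1. The sketch below mimics that numbering in plain Python so the citation labels the summarizer sees are easy to picture. `SearchResult` and `render_results` are hypothetical names used only for illustration, and since the table branch of the template is not visible in this diff, the sketch simply assumes a table result is emitted as its markdown text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchResult:
    """Hypothetical stand-in for one entry of $vectaraQueryResultsDeduped."""
    text: str
    table_markdown: Optional[str] = None  # assumed: set only when the result is a table


def render_results(results: List[SearchResult]) -> str:
    """Number each result from 1 and emit either its table or its text."""
    lines = []
    for index, result in enumerate(results):
        # enumerate() is zero-based like $foreach.index, so add 1 for the label.
        label = f"[{index + 1}]"
        if result.table_markdown is not None:
            body = result.table_markdown
        else:
            body = result.text
        lines.append(f"{label} {body}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        SearchResult("First passage of text."),
        SearchResult("unused", table_markdown="| arm | n |"),
    ]
    print(render_results(demo))
```

The labels produced here are the numbers the model is expected to reuse when it follows `$vectaraCitationInstructions`.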
@@ -42,11 +42,11 @@ def create_assistant_tools(cfg):


     class QueryPublicationsArgs(BaseModel):
+        query: str = Field(..., description="The user query, always in the form of a question",
+                           examples=["what are the risks reported?", "which drug was used and how big was the population?"])

     vec_factory = VectaraToolFactory(vectara_api_key=cfg.api_key,
+                                     vectara_corpus_key=cfg.corpus_key)
     summarizer = 'vectara-summary-table-md-query-ext-jan-2025-gpt-4o'
     ask_publications = vec_factory.create_rag_tool(
         tool_name = "ask_publications",
@@ -54,28 +54,77 @@ def create_assistant_tools(cfg):
         Responds to a user question about a particular result, based on the publications.
         """,
         tool_args_schema = QueryPublicationsArgs,
+        # reranker = "multilingual_reranker_v1", rerank_k = 100,
+        reranker = "chain", rerank_k = 100,
+        rerank_chain = [
+            {
+                "type": "multilingual_reranker_v1",
+                # "cutoff": 0.2
+            },
+            {
+                "type": "mmr",
+                "diversity_bias": 0.2,
+                "limit": 50
+            }
+        ],
+        n_sentences_before = 2, n_sentences_after = 2, lambda_val = 0.005,
+        summary_num_results = 15,
         vectara_summarizer = summarizer,
         include_citations = True,
+        vectara_prompt_text=prompt,
+        save_history = True,
+        verbose=False
     )

+    search_publications = vec_factory.create_search_tool(
+        tool_name = "search_publications",
+        tool_description = """
+        Returns matching publications to a user query.
+        """,
+        tool_args_schema = QueryPublicationsArgs,
+        reranker = "chain", rerank_k = 100,
+        rerank_chain = [
+            {
+                "type": "multilingual_reranker_v1",
+                # "cutoff": 0.2
+            },
+            {
+                "type": "mmr",
+                "diversity_bias": 0.2,
+                "limit": 50
+            }
+        ],
+        # reranker = "multilingual_reranker_v1", rerank_k = 100,
+        n_sentences_before = 2, n_sentences_after = 2, lambda_val = 0.005,
+        save_history = True,
+        verbose=True
+    )
+
+
     tools_factory = ToolsFactory()
     return (
         tools_factory.standard_tools() +
+        [ask_publications, search_publications]
     )

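Reviewer note: both tools now chain a relevance reranker with an `mmr` stage (`diversity_bias` 0.2, `limit` 50). The sketch below shows textbook maximal marginal relevance over candidates that already carry a relevance score from the first stage, to make the effect of those two parameters concrete. It is not Vectara's implementation: the scoring formula, the word-overlap similarity, and the names `jaccard` and `mmr_rerank` are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# (text, relevance score assumed to come from the first reranker in the chain)
Candidate = Tuple[str, float]


def jaccard(a: str, b: str) -> float:
    """Toy similarity: word overlap between two texts, in [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def mmr_rerank(
    candidates: List[Candidate],
    diversity_bias: float = 0.2,
    limit: int = 50,
    similarity: Callable[[str, str], float] = jaccard,
) -> List[Candidate]:
    """Greedily pick results, trading relevance against redundancy."""
    remaining = list(candidates)
    selected: List[Candidate] = []
    while remaining and len(selected) < limit:

        def mmr_score(candidate: Candidate) -> float:
            text, relevance = candidate
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (similarity(text, chosen[0]) for chosen in selected), default=0.0
            )
            return (1.0 - diversity_bias) * relevance - diversity_bias * redundancy

        best = max(remaining, key=mmr_score)
        remaining.remove(best)
        selected.append(best)
    return selected


if __name__ == "__main__":
    docs = [
        ("aspirin reduces risk", 0.90),
        ("aspirin reduces risk", 0.85),  # exact duplicate with slightly lower relevance
        ("placebo group outcomes", 0.80),
    ]
    for text, score in mmr_rerank(docs, diversity_bias=0.2, limit=2):
        print(score, text)
```

With `diversity_bias` at 0 the order is pure relevance; raising it penalizes near-duplicates, which is presumably why the commit pairs it with a larger `summary_num_results`.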
 def initialize_agent(_cfg, agent_progress_callback=None):
     menarini_bot_instructions = """
+    - You are an expert statistician and clinical trial data analyst with extensive experience in designing, analyzing, and interpreting clinical research data.
+    - Your responses should be technically rigorous, data-driven, and written for an audience familiar with advanced statistical methodologies, regulatory standards, and the nuances of clinical trial design.
+    - Call the ask_publications tool to retrieve information to answer the user query.
+      If the initial query lacks comprehensive data, continue to query ask_publications with refined search parameters until you retrieve all necessary numerical details.
+    - Call the search_publications tool to retrieve a list of publications that may contain the information needed to answer the user query.
+      The results include the document_id of each publication, and metadata.
+    - When responding to queries:
+      1) Use precise statistical terminology (e.g., randomization, blinding, intention-to-treat, type I/II error, p-values, confidence intervals, Bayesian methods, etc.)
+         and reference common methodologies or guidelines where applicable (e.g., CONSORT, FDA, EMA).
+      2) Your responses must include contextual information such as sample size and population characteristics. This nuance is crucial in clinical trial analysis.
+         When considering or reporting sample sizes, consider participants who were eligible for the study, those who were randomized, and those who completed the study.
+         If it's unclear which one is being referred to, clarify this in your response or ask the user for clarification.
+      3) Provide clear explanations of statistical concepts, including assumptions, potential biases, and limitations in the context of clinical trial data.
+      4) Ensure that your analysis is evidence-based and reflects current best practices in the field of clinical research and data analysis.
+      5) Before finalizing your answer, review the analysis to ensure that all relevant data has been incorporated and that your conclusions are well-supported by the evidence.
+      6) Provide sources and citations for all data and statistical information included in your responses, as provided in the response from the tools.
     """

     agent = Agent(