Upload 2 files

- README.md: +88 -126
- requirements.txt: +2 -1

README.md CHANGED
````diff
@@ -1,9 +1,3 @@
----
-title: ask.py
-app_file: ask.py
-sdk: gradio
-sdk_version: 5.3.0
----
 # ask.py
 
 [![License](https://img.shields.io/github/license/pengfeng/ask.py)](LICENSE)
````
````diff
@@ -11,18 +5,34 @@ sdk_version: 5.3.0
 A single Python program to implement the search-extract-summarize flow, similar to AI search
 engines such as Perplexity.
 
+- You can run it on the command line or with a GradIO UI.
+- You can control the output behavior, e.g., extract structured data or change the output language.
+- You can control the search behavior, e.g., restrict to a specific site or date, or just scrape
+  a specified list of URLs.
+- You can run it in a cron job or bash script to automate complex search/data extraction tasks.
+
+We have a running UI example [in HuggingFace Spaces](https://huggingface.co/spaces/leettools/AskPy).
+
+![image](https://github.com/user-attachments/assets/0483e6a2-75d7-4fbd-813f-bfa13839c836)
+
+> [!NOTE]
+>
+> - Our main goal is to illustrate the basic concepts of AI search engines with the raw constructs.
+>   Performance or scalability is not in the scope of this program.
+> - We are planning to open source a real search-enabled AI toolset with a real DB setup, real document
+>   pipeline, and real query engine soon. Star and watch this repo for updates!
+
 > [UPDATE]
 >
+> - 2024-11-10: add Chonkie as the default chunker
+> - 2024-10-28: add extract function as a new output mode
+> - 2024-10-25: add hybrid search demo using DuckDB full-text search
 > - 2024-10-22: add GradIO integration
 > - 2024-10-21: use DuckDB for the vector search and use an API for embedding
 > - 2024-10-20: allow specifying a list of input URLs
 > - 2024-10-18: output-language and output-length parameters for the LLM
 > - 2024-10-18: date-restrict and target-site parameters for search
 
-> [!NOTE]
-> Our main goal is to illustrate the basic concepts of AI search engines with the raw constructs.
-> Performance or scalability is not in the scope of this program.
-
 ## The search-extract-summarize flow
 
 Given a query, the program will
````
````diff
@@ -31,7 +41,9 @@ Given a query, the program will
 - crawl and scrape the pages for their text content
 - chunk the text content into chunks and save them into a vectordb
 - perform a vector search with the query and find the top 10 matched chunks
+- [Optional] search using full-text search and combine the results with the vector search
+- [Optional] use a reranker to re-rank the top chunks
+- use the top chunks as the context to ask an LLM to generate the answer
 - output the answer with the references
 
 Of course this flow is a very simplified version of the real AI search engines, but it is a good
````
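The new lines slot into the retrieval half of this flow. As a minimal sketch of that core (embed the chunks, rank them by cosine similarity against the query, answer from the top matches), here is an illustration in plain Python. It assumes the OpenAI Python SDK and the `LLM_API_KEY` variable from the Quick start below, keeps chunks in memory instead of DuckDB, and the model names are placeholders; it is not ask.py's actual code.

```python
# Minimal sketch of the vector-search-and-answer core (not the actual ask.py code).
# Assumes: `pip install openai` and LLM_API_KEY set as in the Quick start below.
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["LLM_API_KEY"])


def embed(texts: list[str]) -> list[list[float]]:
    # One API call embeds a whole batch of chunks.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)


def answer(query: str, chunks: list[str], top_k: int = 10) -> str:
    chunk_vecs = embed(chunks)
    query_vec = embed([query])[0]
    # Rank chunks by cosine similarity and keep the top matches as context.
    ranked = sorted(
        zip(chunks, chunk_vecs),
        key=lambda cv: cosine(query_vec, cv[1]),
        reverse=True,
    )
    context = "\n\n".join(c for c, _ in ranked[:top_k])
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer the query using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuery: {query}"},
        ],
    )
    return resp.choices[0].message.content
```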
````diff
@@ -50,30 +62,33 @@ For example, we can:
 ## Quick start
 
 ```bash
-pip install -r requirements.txt
+# recommended: use Python 3.10 or later, with venv or conda for a virtual environment
+% pip install -r requirements.txt
 
 # modify .env file to set the API keys or export them as environment variables as below
 
 # right now we use Google search API
-export SEARCH_API_KEY="your-google-search-api-key"
-export SEARCH_PROJECT_KEY="your-google-cx-key"
+% export SEARCH_API_KEY="your-google-search-api-key"
+% export SEARCH_PROJECT_KEY="your-google-cx-key"
 
 # right now we use OpenAI API
-export LLM_API_KEY="your-openai-api-key"
+% export LLM_API_KEY="your-openai-api-key"
 
+# by default, the program will start a web UI; see the GradIO Deployment section for more info
+# run the program on the command line with the -c option
+% python ask.py -c -q "What is an LLM agent?"
 
 # we can specify more parameters to control the behavior such as date_restrict and target_site
-python ask.py --help
+% python ask.py --help
 Usage: ask.py [OPTIONS]
 
-  Search web for the query and summarize the results
+  Search web for the query and summarize the results.
 
 Options:
-  --web-ui                        Launch the web interface
   -q, --query TEXT                Query to search
+  -o, --output-mode [answer|extract]
+                                  Output mode for the answer, default is a
+                                  simple answer
   -d, --date-restrict INTEGER     Restrict search results to a specific date
                                   range, default is no restriction
   -s, --target-site TEXT          Restrict search results to a specific site,
````
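The `# modify .env file` step relies on python-dotenv, which is already in requirements.txt. A minimal sketch of that loading pattern, using the variable names above; ask.py's own startup code may differ.

```python
# Minimal sketch of loading API keys from a .env file with python-dotenv.
# The key names match the Quick start above; ask.py's actual loading code may differ.
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from ./.env into the process environment

search_api_key = os.environ["SEARCH_API_KEY"]          # Google search API key
search_project_key = os.environ["SEARCH_PROJECT_KEY"]  # Google custom search (cx) key
llm_api_key = os.environ["LLM_API_KEY"]                # OpenAI API key
```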
````diff
@@ -83,7 +98,12 @@ Options:
   --url-list-file TEXT            Instead of doing web search, scrape the
                                   target URL list and answer the query based
                                   on the content
+  --extract-schema-file TEXT      Pydantic schema for the extract mode
+  --inference-model-name TEXT     Model name to use for inference
+  --hybrid-search                 Use hybrid search mode with both vector
+                                  search and full-text search
+  -c, --run-cli                   Run as a command line tool instead of
+                                  launching the Gradio UI
   -l, --log-level [DEBUG|INFO|WARNING|ERROR]
                                   Set the logging level  [default: INFO]
   --help                          Show this message and exit.
````
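The new `--hybrid-search` flag corresponds to the optional full-text step added to the flow above: combine DuckDB's BM25 full-text ranking with the vector search. A sketch of the full-text half, using DuckDB's `fts` extension; the table and column names here are illustrative, not the schema ask.py actually creates.

```python
# Sketch of BM25 full-text scoring with DuckDB's fts extension, as used by
# the --hybrid-search mode. Table/column names are illustrative only.
import duckdb

con = duckdb.connect()
con.execute("INSTALL fts")
con.execute("LOAD fts")
con.execute("CREATE TABLE chunks (chunk_id INTEGER, content TEXT)")
con.execute(
    "INSERT INTO chunks VALUES "
    "(1, 'LLM agents plan and call tools'), "
    "(2, 'DuckDB supports full-text search via the fts extension')"
)

# Build the full-text index; DuckDB exposes it under the fts_main_<table> schema.
con.execute("PRAGMA create_fts_index('chunks', 'chunk_id', 'content')")

rows = con.execute(
    """
    SELECT chunk_id, fts_main_chunks.match_bm25(chunk_id, ?) AS score
    FROM chunks
    WHERE score IS NOT NULL
    ORDER BY score DESC
    LIMIT 10
    """,
    ["full-text search"],
).fetchall()

# A simple hybrid strategy: merge these BM25 hits with the vector-search hits
# (e.g., take the union and re-rank) before building the LLM context.
print(rows)
```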
````diff
@@ -96,117 +116,59 @@ Options:
 - [Jinja2](https://jinja.palletsprojects.com/en/3.0.x/)
 - [bs4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
 - [DuckDB](https://github.com/duckdb/duckdb)
-- [GradIO](https://
+- [GradIO](https://github.com/gradio-app/gradio)
+- [Chonkie](https://github.com/bhavnicksm/chonkie)
 
````
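Chonkie, newly added here, is the default chunker per the 2024-11-10 update. A sketch of what the chunking step can look like with it; this assumes Chonkie's `TokenChunker` interface and is not a copy of ask.py's code.

```python
# Sketch of the chunking step using Chonkie's TokenChunker (assumed interface;
# ask.py's actual chunker configuration may differ).
from chonkie import TokenChunker

chunker = TokenChunker(chunk_size=512, chunk_overlap=64)

page_text = "...text content scraped from one of the search-result pages..."
chunks = chunker.chunk(page_text)

for chunk in chunks:
    # Each chunk carries its text plus token accounting; the chunk texts are
    # what get embedded and stored in the vector table.
    print(chunk.token_count, chunk.text[:60])
```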
````diff
-##
-
-!
-
-```
-% python ask.py
-
-✅ Running inference with context ...
-
-# Answer
-
-Agentic RAG (Retrieval-Augmented Generation) is needed alongside ChatGPT for several reasons:
-
-1. **Precision and Contextual Relevance**: While ChatGPT offers generative responses, it may not
-reliably provide precise answers, especially when specific, accurate information is critical[5].
-Agentic RAG enhances this by integrating retrieval mechanisms that improve response context and
-accuracy, allowing users to access the most relevant and recent data without the need for costly
-model fine-tuning[2].
-
-2. **Customizability**: RAG allows businesses to create tailored chatbots that can securely
-reference company-specific data[2]. In contrast, ChatGPT's broader capabilities may not be
-directly suited for specialized, domain-specific questions without comprehensive customization[3].
-
-3. **Complex Query Handling**: RAG can be optimized for complex queries and can be adjusted to
-work better with specific types of inputs, such as comparing and contrasting information, a task
-where ChatGPT may struggle under certain circumstances[9]. This level of customization can lead to
-better performance in niche applications where precise retrieval of information is crucial.
-
-4. **Asynchronous Processing Capabilities**: Future agentic systems aim to integrate asynchronous
-handling of actions, allowing for parallel processing and reducing wait times for retrieval and
-computation, which is a limitation in the current form of ChatGPT[7]. This advancement would enhance
-overall efficiency and responsiveness in conversations.
-
-5. **Incorporating Retrieved Information Effectively**: Using RAG can significantly improve how
-retrieved information is utilized within a conversation. By effectively managing the context and
-relevance of retrieved documents, RAG helps in framing prompts that can guide ChatGPT towards
-delivering more accurate responses[10].
-
-In summary, while ChatGPT excels in generating conversational responses, agentic RAG brings
-precision, customization, and efficiency that can significantly enhance the overall conversational
-AI experience.
-
-# References
-
-[1] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[2] https://www.linkedin.com/posts/brianjuliusdc_dax-powerbi-chatgpt-activity-7235953280177041408-wQqq
-[3] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[4] https://community.openai.com/t/prompt-engineering-for-rag/621495
-[5] https://www.ben-evans.com/benedictevans/2024/6/8/building-ai-products
-[6] https://community.openai.com/t/prompt-engineering-for-rag/621495
-[7] https://www.linkedin.com/posts/kurtcagle_agentic-rag-personalizing-and-optimizing-activity-7198097129993613312-z7Sm
-[8] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[9] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[10] https://community.openai.com/t/prompt-engineering-for-rag/621495
-```
````
````diff
-###
-
-```
-# Answer
-
-OpenAI Swarm Framework is an experimental platform designed for building, orchestrating, and
-deploying multi-agent systems, enabling multiple AI agents to collaborate on complex tasks. It contrasts
-with traditional single-agent models by facilitating agent interaction and coordination, thus enhancing
-efficiency[5][9]. The framework provides developers with a way to orchestrate these agent systems in
-a lightweight manner, leveraging Node.js for scalable applications[1][4].
-
-One implementation of this framework is Swarm.js, which serves as a Node.js SDK, allowing users to
-create and manage agents that perform tasks and hand off conversations. Swarm.js is positioned as
-an educational tool, making it accessible for both beginners and experts, although it may still contain
-bugs and is currently lightweight[1][3][7]. This new approach emphasizes multi-agent collaboration and is
-well-suited for back-end development, requiring some programming expertise for effective implementation[9].
-
-Overall, OpenAI Swarm facilitates a shift in how AI systems can collaborate, differing from existing
-OpenAI tools by focusing on backend orchestration rather than user-interactive front-end applications[9].
-
-# References
-
-[1] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[2] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[3] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[4] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[5] https://community.openai.com/t/swarm-some-initial-insights/976602
-[6] https://community.openai.com/t/swarm-some-initial-insights/976602
-[7] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[8] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[9] https://community.openai.com/t/swarm-some-initial-insights/976602
-[10] https://community.openai.com/t/swarm-some-initial-insights/976602
-```
````
````diff
+
+## GradIO Deployment
+
+> [!NOTE]
+> Original GradIO app-sharing document [here](https://www.gradio.app/guides/sharing-your-app).
+
+### Quick test and sharing
+
+By default, the program will start a web UI and share it through GradIO.
+
+```bash
+% python ask.py
+* Running on local URL:  http://127.0.0.1:7860
+* Running on public URL: https://77c277af0330326587.gradio.live
+
+# you can also set SHARE_GRADIO_UI to False to run locally only
+% export SHARE_GRADIO_UI=False
+% python ask.py
+* Running on local URL:  http://127.0.0.1:7860
+```
````
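The mechanism behind this is GradIO's `launch(share=...)` switch. A sketch of how an app can wire it to the `SHARE_GRADIO_UI` variable; the UI layout and exact wiring in ask.py may differ.

```python
# Sketch of wiring SHARE_GRADIO_UI to GradIO's share switch (illustrative;
# ask.py's actual UI layout and handler differ).
import os

import gradio as gr


def run_query(query: str) -> str:
    # Placeholder for the search-extract-summarize pipeline.
    return f"Answer for: {query}"


share_ui = os.environ.get("SHARE_GRADIO_UI", "True").lower() == "true"

demo = gr.Interface(
    fn=run_query,
    inputs=gr.Textbox(label="Query"),
    outputs=gr.Markdown(label="Answer"),
    title="ask.py",
)
# share=True asks GradIO to create the temporary *.gradio.live public URL
# shown above; share=False keeps only the local URL.
demo.launch(share=share_ui)
```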
````diff
+### To share a more permanent link using HuggingFace Spaces
+
+- First, [create a free HuggingFace account](https://huggingface.co/welcome).
+- Then, in your [settings/token page](https://huggingface.co/settings/tokens), create a new token with Write permissions.
+- In your terminal, run the following commands in your app directory to deploy your program to
+  HuggingFace Spaces:
+
+```bash
+% pip install gradio
+% gradio deploy
+Creating new Spaces Repo in '/home/you/ask.py'. Collecting metadata, press Enter to accept default value.
+Enter Spaces app title [ask.py]: ask.py
+Enter Gradio app file [ask.py]:
+Enter Spaces hardware (cpu-basic, cpu-upgrade, t4-small, t4-medium, l4x1, l4x4, zero-a10g, a10g-small, a10g-large, a10g-largex2, a10g-largex4, a100-large, v5e-1x1, v5e-2x2, v5e-2x4) [cpu-basic]:
+Any Spaces secrets (y/n) [n]: y
+Enter secret name (leave blank to end): SEARCH_API_KEY
+Enter secret value for SEARCH_API_KEY: YOUR_SEARCH_API_KEY
+Enter secret name (leave blank to end): SEARCH_PROJECT_KEY
+Enter secret value for SEARCH_PROJECT_KEY: YOUR_SEARCH_PROJECT_KEY
+Enter secret name (leave blank to end): LLM_API_KEY
+Enter secret value for LLM_API_KEY: YOUR_LLM_API_KEY
+Enter secret name (leave blank to end):
+Create Github Action to automatically update Space on 'git push'? [n]: n
+Space available at https://huggingface.co/spaces/your_user_name/ask.py
+```
+
+Now you can use the HuggingFace space app to run your queries.
+
+## Use Cases
+
+- [Search like Perplexity](demos/search_and_answer.md)
+- [Only use the latest information from a specific site](demos/search_on_site_and_date.md)
+- [Extract information from web search results](demos/search_and_extract.md)
````
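The extract use case pairs `--output-mode extract` with `--extract-schema-file`. A sketch of the underlying idea: ask the LLM for JSON and validate it against a Pydantic model. The `Company` model and prompt here are made up for illustration; they are not a schema shipped with ask.py.

```python
# Sketch of the extract output mode: request JSON from the LLM and validate it
# against a Pydantic schema. The Company model is illustrative only.
import json
import os

from openai import OpenAI
from pydantic import BaseModel


class Company(BaseModel):
    name: str
    founded_year: int
    headquarters: str


client = OpenAI(api_key=os.environ["LLM_API_KEY"])


def extract_company(context: str) -> Company:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": "Extract the fields of this JSON schema from the text: "
                + json.dumps(Company.model_json_schema()),
            },
            {"role": "user", "content": context},
        ],
    )
    # Validation fails loudly if the LLM's JSON does not match the schema.
    return Company.model_validate_json(resp.choices[0].message.content)
```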
requirements.txt CHANGED

````diff
@@ -6,4 +6,5 @@ bs4==0.0.2
 lxml==4.8.0
 python-dotenv==1.0.1
 duckdb==1.1.2
-gradio==5.3.0
+gradio==5.3.0
+chonkie==0.1.2
````