LeetTools committed · verified
Commit 7b647a2 · Parent(s): 6eabcb6

Upload 2 files

Files changed (2):

1. README.md (+88, -126)
2. requirements.txt (+2, -1)
README.md CHANGED

@@ -1,9 +1,3 @@
----
-title: ask.py
-app_file: ask.py
-sdk: gradio
-sdk_version: 5.3.0
----
 # ask.py
 
 [![License](https://img.shields.io/github/license/pengfeng/ask.py)](LICENSE)
@@ -11,18 +5,34 @@ sdk_version: 5.3.0
 A single Python program to implement the search-extract-summarize flow, similar to AI search
 engines such as Perplexity.
 
 > [UPDATE]
 >
 > - 2024-10-22: add GradIO integration
 > - 2024-10-21: use DuckDB for the vector search and use an API for embedding
 > - 2024-10-20: allow specifying a list of input URLs
 > - 2024-10-18: output-language and output-length parameters for the LLM
 > - 2024-10-18: date-restrict and target-site parameters for search
 
-> [!NOTE]
-> Our main goal is to illustrate the basic concepts of AI search engines with the raw constructs.
-> Performance or scalability is not in the scope of this program.
-
 ## The search-extract-summarize flow
 
 Given a query, the program will
@@ -31,7 +41,9 @@ Given a query, the program will
 - crawl and scrape the pages for their text content
 - chunk the text content into chunks and save them into a vectordb
 - perform a vector search with the query and find the top 10 matched chunks
-- use the top 10 chunks as the context to ask an LLM to generate the answer
 - output the answer with the references
 
 Of course this flow is a very simplified version of the real AI search engines, but it is a good
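The chunk → embed → vector-search steps in the flow above can be sketched in a few lines of plain Python. This is a toy illustration (fixed-size character chunks, letter-frequency "embeddings"), not ask.py's actual implementation, which uses a real chunker, an embedding API, and DuckDB:

```python
# Toy sketch of the chunk -> embed -> vector-search steps.
# Illustrative stand-ins only; ask.py embeds via an API and stores in DuckDB.

def chunk_text(text: str, size: int = 40) -> list[str]:
    """Split text into fixed-size character chunks (real chunkers are smarter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> list[float]:
    """Toy embedding: a letter-frequency vector standing in for an embedding API."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(query: str, chunks: list[str], k: int = 10) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

With a real embedding model the same retrieval loop applies; only `embed` and the storage layer change.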
@@ -50,30 +62,33 @@ For example, we can:
 ## Quick start
 
 ```bash
-
-pip install -r requirements.txt
 
 # modify .env file to set the API keys or export them as environment variables as below
 
 # right now we use Google search API
-export SEARCH_API_KEY="your-google-search-api-key"
-export SEARCH_PROJECT_KEY="your-google-cx-key"
 
 # right now we use OpenAI API
-export LLM_API_KEY="your-openai-api-key"
 
-# run the program
-python ask.py -q "What is an LLM agent?"
 
 # we can specify more parameters to control the behavior such as date_restrict and target_site
-python ask.py --help
 Usage: ask.py [OPTIONS]
 
-  Search web for the query and summarize the results
 
 Options:
-  --web-ui                     Launch the web interface
   -q, --query TEXT             Query to search
   -d, --date-restrict INTEGER  Restrict search results to a specific date
                                range, default is no restriction
   -s, --target-site TEXT       Restrict search results to a specific site,
@@ -83,7 +98,12 @@ Options:
   --url-list-file TEXT         Instead of doing web search, scrape the
                                target URL list and answer the query based
                                on the content
-  -m, --model-name TEXT        Model name to use for inference
   -l, --log-level [DEBUG|INFO|WARNING|ERROR]
                                Set the logging level [default: INFO]
   --help                       Show this message and exit.
@@ -96,117 +116,59 @@ Options:
 - [Jinja2](https://jinja.palletsprojects.com/en/3.0.x/)
 - [bs4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
 - [DuckDB](https://github.com/duckdb/duckdb)
-- [GradIO](https://grad.io)
-
-## Screenshot for the GradIO integration
-
-![image](https://github.com/user-attachments/assets/0483e6a2-75d7-4fbd-813f-bfa13839c836)
-
-## Sample output
-
-### General Search
-
-```
-% python ask.py -q "Why do we need agentic RAG even if we have ChatGPT?"
-
-Found 10 links for query: Why do we need agentic RAG even if we have ChatGPT?
-✅ Scraping the URLs ...
-Scraped 10 URLs ...
-Chunking the text ...
-Saving to vector DB ...
-Querying the vector DB ...
-✅ Running inference with context ...
-
-# Answer
-
-Agentic RAG (Retrieval-Augmented Generation) is needed alongside ChatGPT for several reasons:
-
-1. **Precision and Contextual Relevance**: While ChatGPT offers generative responses, it may not
-reliably provide precise answers, especially when specific, accurate information is critical[5].
-Agentic RAG enhances this by integrating retrieval mechanisms that improve response context and
-accuracy, allowing users to access the most relevant and recent data without the need for costly
-model fine-tuning[2].
-
-2. **Customizability**: RAG allows businesses to create tailored chatbots that can securely
-reference company-specific data[2]. In contrast, ChatGPT’s broader capabilities may not be
-directly suited for specialized, domain-specific questions without comprehensive customization[3].
-
-3. **Complex Query Handling**: RAG can be optimized for complex queries and can be adjusted to
-work better with specific types of inputs, such as comparing and contrasting information, a task
-where ChatGPT may struggle under certain circumstances[9]. This level of customization can lead to
-better performance in niche applications where precise retrieval of information is crucial.
-
-4. **Asynchronous Processing Capabilities**: Future agentic systems aim to integrate asynchronous
-handling of actions, allowing for parallel processing and reducing wait times for retrieval and
-computation, which is a limitation in the current form of ChatGPT[7]. This advancement would enhance
-overall efficiency and responsiveness in conversations.
-
-5. **Incorporating Retrieved Information Effectively**: Using RAG can significantly improve how
-retrieved information is utilized within a conversation. By effectively managing the context and
-relevance of retrieved documents, RAG helps in framing prompts that can guide ChatGPT towards
-delivering more accurate responses[10].
-
-In summary, while ChatGPT excels in generating conversational responses, agentic RAG brings
-precision, customization, and efficiency that can significantly enhance the overall conversational
-AI experience.
-
-# References
-
-[1] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[2] https://www.linkedin.com/posts/brianjuliusdc_dax-powerbi-chatgpt-activity-7235953280177041408-wQqq
-[3] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[4] https://community.openai.com/t/prompt-engineering-for-rag/621495
-[5] https://www.ben-evans.com/benedictevans/2024/6/8/building-ai-products
-[6] https://community.openai.com/t/prompt-engineering-for-rag/621495
-[7] https://www.linkedin.com/posts/kurtcagle_agentic-rag-personalizing-and-optimizing-activity-7198097129993613312-z7Sm
-[8] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[9] https://community.openai.com/t/how-to-use-rag-properly-and-what-types-of-query-it-is-good-at/658204
-[10] https://community.openai.com/t/prompt-engineering-for-rag/621495
 ```
 
-### Only use the latest information from a specific site
 
-This following query will only use the information from openai.com that are updated in the previous
-day. The behavior is similar to the "site:openai.com" and "date-restrict" search parameters in Google
-search.
 
 ```
-% python ask.py -q "OpenAI Swarm Framework" -d 1 -s openai.com
-Found 10 links for query: OpenAI Swarm Framework
-✅ Scraping the URLs ...
-Scraped 10 URLs ...
-✅ Chunking the text ...
-Saving to vector DB ...
-Querying the vector DB to get context ...
-Running inference with context ...
-
-# Answer
-
-OpenAI Swarm Framework is an experimental platform designed for building, orchestrating, and
-deploying multi-agent systems, enabling multiple AI agents to collaborate on complex tasks. It contrasts
-with traditional single-agent models by facilitating agent interaction and coordination, thus enhancing
-efficiency[5][9]. The framework provides developers with a way to orchestrate these agent systems in
-a lightweight manner, leveraging Node.js for scalable applications[1][4].
-
-One implementation of this framework is Swarm.js, which serves as a Node.js SDK, allowing users to
-create and manage agents that perform tasks and hand off conversations. Swarm.js is positioned as
-an educational tool, making it accessible for both beginners and experts, although it may still contain
-bugs and is currently lightweight[1][3][7]. This new approach emphasizes multi-agent collaboration and is
-well-suited for back-end development, requiring some programming expertise for effective implementation[9].
-
-Overall, OpenAI Swarm facilitates a shift in how AI systems can collaborate, differing from existing
-OpenAI tools by focusing on backend orchestration rather than user-interactive front-end applications[9].
-
-# References
-
-[1] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[2] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[3] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[4] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[5] https://community.openai.com/t/swarm-some-initial-insights/976602
-[6] https://community.openai.com/t/swarm-some-initial-insights/976602
-[7] https://community.openai.com/t/introducing-swarm-js-node-js-implementation-of-openai-swarm/977510
-[8] https://community.openai.com/t/introducing-swarm-js-a-node-js-implementation-of-openai-swarm/977510
-[9] https://community.openai.com/t/swarm-some-initial-insights/976602
-[10] https://community.openai.com/t/swarm-some-initial-insights/976602
-```
 # ask.py
 
 [![License](https://img.shields.io/github/license/pengfeng/ask.py)](LICENSE)
 
 A single Python program to implement the search-extract-summarize flow, similar to AI search
 engines such as Perplexity.
 
+- You can run it on the command line or with a GradIO UI.
+- You can control the output behavior, e.g., extract structured data or change the output language.
+- You can control the search behavior, e.g., restrict to a specific site or date, or just scrape
+  a specified list of URLs.
+- You can run it in a cron job or bash script to automate complex search/data extraction tasks.
+
+We have a running UI example [in HuggingFace Spaces](https://huggingface.co/spaces/leettools/AskPy).
+
+![image](https://github.com/user-attachments/assets/0483e6a2-75d7-4fbd-813f-bfa13839c836)
+
+> [!NOTE]
+>
+> - Our main goal is to illustrate the basic concepts of AI search engines with the raw constructs.
+>   Performance or scalability is not in the scope of this program.
+> - We are planning to open source a real search-enabled AI toolset with a real DB setup, real document
+>   pipeline, and real query engine soon. Star and watch this repo for updates!
+
 > [UPDATE]
 >
+> - 2024-11-10: add Chonkie as the default chunker
+> - 2024-10-28: add an extract function as a new output mode
+> - 2024-10-25: add a hybrid search demo using DuckDB full-text search
 > - 2024-10-22: add GradIO integration
 > - 2024-10-21: use DuckDB for the vector search and use an API for embedding
 > - 2024-10-20: allow specifying a list of input URLs
 > - 2024-10-18: output-language and output-length parameters for the LLM
 > - 2024-10-18: date-restrict and target-site parameters for search
 
 ## The search-extract-summarize flow
 
 Given a query, the program will
 
 - crawl and scrape the pages for their text content
 - chunk the text content into chunks and save them into a vectordb
 - perform a vector search with the query and find the top 10 matched chunks
+- [Optional] search using full-text search and combine the results with the vector search
+- [Optional] use a reranker to re-rank the top chunks
+- use the top chunks as the context to ask an LLM to generate the answer
 - output the answer with the references
 
 Of course this flow is a very simplified version of the real AI search engines, but it is a good
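One common way to combine the vector-search and full-text rankings in the optional hybrid-search step is reciprocal-rank fusion (RRF). The sketch below is a generic illustration of that merging idea, not necessarily the exact strategy ask.py uses:

```python
# Generic reciprocal-rank fusion (RRF) sketch for merging ranked result lists,
# e.g. one from vector search and one from DuckDB full-text search.

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids; items ranked high in any
    input list float to the top of the fused ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

A chunk that appears near the top of both lists (e.g. matched semantically and by keywords) outranks a chunk that appears in only one.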
 
 ## Quick start
 
 ```bash
+# recommended: use Python 3.10 or later, and venv or conda to create a virtual environment
+% pip install -r requirements.txt
 
 # modify .env file to set the API keys or export them as environment variables as below
 
 # right now we use Google search API
+% export SEARCH_API_KEY="your-google-search-api-key"
+% export SEARCH_PROJECT_KEY="your-google-cx-key"
 
 # right now we use OpenAI API
+% export LLM_API_KEY="your-openai-api-key"
 
+# By default, the program will start a web UI. See the GradIO Deployment section for more info.
+# Run the program on the command line with the -c option
+% python ask.py -c -q "What is an LLM agent?"
 
 # we can specify more parameters to control the behavior such as date_restrict and target_site
+% python ask.py --help
 Usage: ask.py [OPTIONS]
 
+  Search web for the query and summarize the results.
 
 Options:
   -q, --query TEXT             Query to search
+  -o, --output-mode [answer|extract]
+                               Output mode for the answer, default is a
+                               simple answer
   -d, --date-restrict INTEGER  Restrict search results to a specific date
                                range, default is no restriction
   -s, --target-site TEXT       Restrict search results to a specific site,
 
   --url-list-file TEXT         Instead of doing web search, scrape the
                                target URL list and answer the query based
                                on the content
+  --extract-schema-file TEXT   Pydantic schema for the extract mode
+  --inference-model-name TEXT  Model name to use for inference
+  --hybrid-search              Use hybrid search mode with both vector
+                               search and full-text search
+  -c, --run-cli                Run as a command line tool instead of
+                               launching the Gradio UI
   -l, --log-level [DEBUG|INFO|WARNING|ERROR]
                                Set the logging level [default: INFO]
   --help                       Show this message and exit.
 
 - [Jinja2](https://jinja.palletsprojects.com/en/3.0.x/)
 - [bs4](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
 - [DuckDB](https://github.com/duckdb/duckdb)
+- [GradIO](https://github.com/gradio-app/gradio)
+- [Chonkie](https://github.com/bhavnicksm/chonkie)
+
+## GradIO Deployment
+
+> [!NOTE]
+> The original GradIO app-sharing guide is [here](https://www.gradio.app/guides/sharing-your-app).
+
+### Quick test and sharing
+
+By default, the program will start a web UI and share it through GradIO.
+
+```bash
+% python ask.py
+* Running on local URL: http://127.0.0.1:7860
+* Running on public URL: https://77c277af0330326587.gradio.live
+
+# you can also set SHARE_GRADIO_UI=False to only run locally
+% export SHARE_GRADIO_UI=False
+% python ask.py
+* Running on local URL: http://127.0.0.1:7860
 ```
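The `SHARE_GRADIO_UI` toggle shown above can be interpreted with a small helper like the following. This is a hypothetical sketch, not ask.py's exact code; Gradio itself exposes sharing through `launch(share=...)`:

```python
# Hypothetical sketch of reading the SHARE_GRADIO_UI environment toggle.
# Not ask.py's actual implementation.
import os

def should_share() -> bool:
    """Share by default; treat 'False', '0', or 'no' (any case) as off."""
    return os.environ.get("SHARE_GRADIO_UI", "True").strip().lower() not in ("false", "0", "no")

# usage (assuming a Gradio Blocks/Interface object named `demo`):
# demo.launch(share=should_share())
```

When sharing is on, Gradio creates the temporary public `*.gradio.live` URL in addition to the local one.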
 
+### To share a more permanent link using HuggingFace Spaces
+
+- First, [create a free HuggingFace account](https://huggingface.co/welcome).
+- Then, in your [settings/tokens page](https://huggingface.co/settings/tokens), create a new token with Write permissions.
+- In your terminal, run the following commands in your app directory to deploy the program to
+  HuggingFace Spaces:
+
+```bash
+% pip install gradio
+% gradio deploy
+Creating new Spaces Repo in '/home/you/ask.py'. Collecting metadata, press Enter to accept default value.
+Enter Spaces app title [ask.py]: ask.py
+Enter Gradio app file [ask.py]:
+Enter Spaces hardware (cpu-basic, cpu-upgrade, t4-small, t4-medium, l4x1, l4x4, zero-a10g, a10g-small, a10g-large, a10g-largex2, a10g-largex4, a100-large, v5e-1x1, v5e-2x2, v5e-2x4) [cpu-basic]:
+Any Spaces secrets (y/n) [n]: y
+Enter secret name (leave blank to end): SEARCH_API_KEY
+Enter secret value for SEARCH_API_KEY: YOUR_SEARCH_API_KEY
+Enter secret name (leave blank to end): SEARCH_PROJECT_KEY
+Enter secret value for SEARCH_PROJECT_KEY: YOUR_SEARCH_PROJECT_KEY
+Enter secret name (leave blank to end): LLM_API_KEY
+Enter secret value for LLM_API_KEY: YOUR_LLM_API_KEY
+Enter secret name (leave blank to end):
+Create Github Action to automatically update Space on 'git push'? [n]: n
+Space available at https://huggingface.co/spaces/your_user_name/ask.py
 ```
+
+Now you can use the HuggingFace Space app to run your queries.
+
+## Use Cases
+
+- [Search like Perplexity](demos/search_and_answer.md)
+- [Only use the latest information from a specific site](demos/search_on_site_and_date.md)
+- [Extract information from web search results](demos/search_and_extract.md)
requirements.txt CHANGED

@@ -6,4 +6,5 @@ bs4==0.0.2
 lxml==4.8.0
 python-dotenv==1.0.1
 duckdb==1.1.2
-gradio==5.3.0
+gradio==5.3.0
+chonkie==0.1.2