Aditya Patkar committed on Jul 21, 2023

Commit

4fad4c6

1 Parent(s): 137df39

Added readme and comments

Files changed (2)

README.md +77 -1
app.py +103 -73

README.md CHANGED

@@ -9,4 +9,80 @@ app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 pinned: false
 ---
+# People Data Labs Query Tool with Streamlit
+## Overview
+This is a simple web application built with Streamlit that allows you to search for a person or company using the People Data Labs API. You can perform searches by providing the necessary parameters, such as your API key, the type of search (person or company), an SQL query, the dataset to search in, the number of results to return, and whether you want the results to be pretty printed.
+## Guidelines
+To use the People Data Labs Query Tool, follow the guidelines below:
+- **API Key**: You will need to obtain an API key from People Data Labs. If you don't have one, visit their website to get your API key.
+- **Type**: Specify whether you want to search for a person or a company. You can select either option from the dropdown menu.
+- **SQL Query**: Provide the SQL query you want to run. You can find this query on the dashboard provided by People Data Labs.
+- **Dataset**: If you want to narrow down your search to a specific dataset, you can enter the dataset name. Leaving this field blank will search across all datasets.
+- **Size**: Specify the number of results you want to return. The tool will return the specified number of results based on your query.
+- **Pretty Print**: By default, the results are not pretty printed. If you prefer the results to be displayed in a more readable format, you can enable this option.
+## Installation and Local Setup
+Follow the steps below to install and run the People Data Labs Query Tool locally:
+1. Clone the repository to your local machine:
+```bash
+git clone https://github.com/your_username/your_repo.git
+```
+2. Navigate to the project directory:
+```bash
+cd your_repo
+```
+3. Create a virtual environment (optional, but recommended):
+```bash
+python3 -m venv venv
+```
+4. Activate the virtual environment:
+```bash
+source venv/bin/activate
+```
+5. Install the required dependencies from `requirements.txt`:
+```bash
+pip install -r requirements.txt
+```
+## Running the Application
+Once you have completed the installation and local setup, you can run the People Data Labs Query Tool:
+```bash
+streamlit run app.py
+```
+After executing the command above, Streamlit will launch a local development server and display a URL (usually http://localhost:8501) where you can access the web application in your browser.
+## Note
+Please remember to keep your API key secure and do not share it publicly or commit it to version control systems.
+Happy searching with the People Data Labs Query Tool!
+## Resources
+The app is built with [Streamlit](https://streamlit.io/), a Python library that makes it easy to build web applications for machine learning and data science.
+The app is also deployed on Hugging Face Spaces. You can access the app [here](https://huggingface.co/spaces/adityapatkar/PDL_QUERY).
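Note on the README guidelines: the form fields described above (SQL Query, Dataset, Size, Pretty Print) end up as the keyword arguments that app.py passes to the search call. A minimal sketch of that normalization follows. It is not part of the commit. The helper name `build_search_params` is hypothetical, and the extra whitespace stripping and the range check are assumptions layered on top of what app.py does.

```python
def build_search_params(sql_query, dataset="", size=10, pretty=False):
    """Turn raw form values into keyword arguments for a search call.

    Hypothetical helper: it mirrors the normalization in app.py
    (strip double quotes, blank dataset becomes "all") and adds
    whitespace stripping and a range check as assumptions.
    """
    # Drop surrounding whitespace and double quotes from a pasted query.
    sql = sql_query.strip().strip('"').strip()
    if not sql:
        raise ValueError("SQL Query is required!")

    # A blank dataset means "search everything", which app.py spells "all".
    dataset = dataset.strip() or "all"

    # The form limits size to the range 1..1000.
    if not 1 <= size <= 1000:
        raise ValueError("size must be between 1 and 1000")

    return {
        "sql": sql,
        "dataset": dataset,
        "size": int(size),
        "pretty": bool(pretty),
    }


# Illustrative query text only; real queries come from the PDL dashboard.
example = build_search_params('"SELECT * FROM person"', dataset="  ", size=5)
```

Keeping this step in a plain function means it can be checked without a Streamlit session or an API key. The resulting dictionary would then be unpacked into the search call with `**`, as app.py does with `params`.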

app.py CHANGED

@@ -1,7 +1,12 @@
 import datetime as dt
 import pandas as pd
-import streamlit as st
 from peopledatalabs import PDLPY as pdl
@@ -9,6 +14,8 @@ def setup():
     """
     Streamlit related setup. This has to be run for each page.
     """
     hide_streamlit_style = """
     <style>
@@ -17,87 +24,110 @@ def setup():
     </style>
     """
     st.markdown(hide_streamlit_style, unsafe_allow_html=True)
 def main():
-    '''
-     Main function of the app.
-    '''
     setup()
-    st.title('People Data Labs Query Tool')
-    st.subheader('Search for a Person or Company')
-    st.write('''
-                This tool allows you to search for a person or company using the People Data Labs API. \n
-                *Guidelines*: \n
-                    - API Key: Your API key from People Data Labs \n
-                    - Type: Person or Company \n
-                    - SQL Query: The SQL query you want to run. Find it on the dashboard \n
-                    - Dataset: The dataset you want to search in. Leave blank for all   \n
-                    - Size: The number of results you want to return \n
-                    - Pretty Print: Whether you want the results to be pretty printed. Default is False \n \n\n
-            ''')
-    #horizontal line
-    st.markdown('<hr>', unsafe_allow_html=True)
-    st.markdown('<br>', unsafe_allow_html=True)
-    st.subheader('Search')
-    form = st.form(key='my_form')
-    api_key = form.text_input(label='API Key')
-    type_search = form.selectbox(label='Type', options=['Person', 'Company'])
-    sql_query = form.text_area(label='SQL Query')
     sql_query = sql_query.strip('"')
-    dataset = form.text_input(label='Dataset')
-    if dataset is None or dataset.strip() == '':
-        dataset = 'all'
-    size = form.number_input(label='Size', min_value=1, max_value=1000, value=10)
-    pretty = form.checkbox(label='Pretty Print')
-    submit_button = form.form_submit_button(label='Submit')
     if submit_button:
-        if api_key is None or api_key.strip() == '':
-            st.error('API Key is required!')
-        elif sql_query is None or sql_query.strip() == '':
-            st.error('SQL Query is required!')
         else:
-            PARAMS = {'sql': sql_query, 'dataset': dataset, 'size': size, 'pretty': pretty}
             client = pdl(api_key=api_key)
-            if type_search == 'Person':
-                response = client.person.search(**PARAMS).json()
-            elif type_search == 'Company':
-                response = client.company.search(**PARAMS).json()
             if response["status"] == 200:
                 st.success(f"Found {response['total']} records for this search.")
-                data = response['data']
-                #convert json to csv. Json is list of dicts
-                df = pd.DataFrame(data)
-                csv = df.to_csv(index=False)
-                #show data
-                st.markdown('<hr>', unsafe_allow_html=True)
-                st.markdown('<br>', unsafe_allow_html=True)
-                st.subheader('Data')
-                st.download_button(label='Download CSV', data=csv, file_name=f'{type_search}-data-{dt.datetime.now().strftime("%Y-%m-%d %H:%M:%S")}.csv')
-                st.dataframe(df)
             else:
-                st.error('Error retrieving data!')
                 st.error("Error:")
-                st.write(response['error'])
-if __name__ == '__main__':
-    main()

+"""
+This is the main app file. It contains the main function of the app.
+"""
+# imports
 import datetime as dt
 import pandas as pd
+import streamlit as st
 from peopledatalabs import PDLPY as pdl
     """
     Streamlit related setup. This has to be run for each page.
     """
+    # hide hamburger menu
     hide_streamlit_style = """
     <style>
     </style>
     """
     st.markdown(hide_streamlit_style, unsafe_allow_html=True)
 def main():
+    """
+    Main function of the app.
+    """
     setup()
+    # title, subheader, and description
+    st.title("People Data Labs Query Tool")
+    st.subheader("Search for a Person or Company")
+    st.write(
+        """
+        This tool allows you to search for a person or company using the People Data Labs API. \n
+        *Guidelines*: \n
+            - API Key: Your API key from People Data Labs \n
+            - Type: Person or Company \n
+            - SQL Query: The SQL query you want to run. Find it on the dashboard \n
+            - Dataset: The dataset you want to search in. Leave blank for all   \n
+            - Size: The number of results you want to return \n
+            - Pretty Print: Whether you want the results to be pretty printed.
+                Default is False \n \n\n
+        """
+    )
+    # horizontal line and line break
+    st.markdown("<hr>", unsafe_allow_html=True)
+    st.markdown("<br>", unsafe_allow_html=True)
+    # search form, this is a streamlit form
+    st.subheader("Search")
+    form = st.form(key="my_form")
+    api_key = form.text_input(label="API Key")
+    type_search = form.selectbox(label="Type", options=["Person", "Company"])
+    sql_query = form.text_area(label="SQL Query")
     sql_query = sql_query.strip('"')
+    dataset = form.text_input(label="Dataset")
+    # if dataset is empty, set to all
+    if dataset is None or dataset.strip() == "":
+        dataset = "all"
+    size = form.number_input(label="Size", min_value=1, max_value=1000, value=10)
+    pretty = form.checkbox(label="Pretty Print")
+    submit_button = form.form_submit_button(label="Submit")
+    # if submit button is clicked
     if submit_button:
+        # check if api key and sql query are not empty
+        if api_key is None or api_key.strip() == "":
+            st.error("API Key is required!")
+        elif sql_query is None or sql_query.strip() == "":
+            st.error("SQL Query is required!")
         else:
+            # if all is good, run the query
+            params = {
+                "sql": sql_query,
+                "dataset": dataset,
+                "size": size,
+                "pretty": pretty,
+            }
             client = pdl(api_key=api_key)
+            # select the client based on the type of search
+            if type_search == "Person":
+                response = client.person.search(**params).json()
+            elif type_search == "Company":
+                response = client.company.search(**params).json()
+            # if status is 200, show the data
             if response["status"] == 200:
                 st.success(f"Found {response['total']} records for this search.")
+                data = response["data"]
+                # convert json to csv. Json is list of dicts
+                data_frame = pd.DataFrame(data)
+                csv = data_frame.to_csv(index=False)
+                # show data
+                st.markdown("<hr>", unsafe_allow_html=True)
+                st.markdown("<br>", unsafe_allow_html=True)
+                st.subheader("Data")
+                now = dt.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+                # download button
+                st.download_button(
+                    label="Download CSV",
+                    data=csv,
+                    file_name=f"{type_search}-data-{now}.csv",
+                )
+                st.dataframe(data_frame)
+            # handle errors
             else:
+                st.error("Error retrieving data!")
                 st.error("Error:")
+                st.write(response["error"])
+# run the app
+if __name__ == "__main__":
+    main()
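
Note on the download and error paths: the file name built in app.py uses the timestamp format `%Y-%m-%d %H:%M:%S`, which contains a space and colons, and colons are not allowed in file names on Windows. The sketch below shows one possible alternative, together with a small helper for the status check. It is not part of the commit. The names `csv_file_name` and `describe_response` are hypothetical, and the fallback values passed to `.get` are assumptions; only the keys `status`, `total`, and `error` come from app.py.

```python
import datetime as dt


def csv_file_name(type_search, now=None):
    """Build a download file name without spaces or colons."""
    now = now or dt.datetime.now()
    # Hyphens and an underscore instead of spaces and colons.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{type_search}-data-{stamp}.csv"


def describe_response(response):
    """Return (ok, message) for a decoded search response dict.

    Assumes the same keys app.py reads: "status", "total", "error".
    """
    if response.get("status") == 200:
        total = response.get("total", 0)
        return True, f"Found {total} records for this search."
    # Fall back to a generic message when no error detail is present.
    error = response.get("error", "unknown error")
    return False, f"Error retrieving data: {error}"
```

In the app, the first value of the tuple could decide between `st.success` and `st.error`, which would also collapse the two consecutive `st.error` calls into one message.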