Aditya Patkar committed on
Commit
4fad4c6
·
1 Parent(s): 137df39

Added readme and comments

Files changed (2)
  1. README.md +77 -1
  2. app.py +103 -73
README.md CHANGED
@@ -9,4 +9,80 @@ app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
9
  pinned: false
10
  ---
11
 
12
+ # People Data Labs Query Tool with Streamlit
13
+
14
+ ## Overview
15
+
16
+ This is a simple Streamlit web application for searching for a person or company with the People Data Labs API. Each search takes your API key, the type of search (person or company), an SQL query, the dataset to search in, the number of results to return, and a flag that controls whether the results are pretty printed.
17
+
18
+ ## Guidelines
19
+
20
+ To use the People Data Labs Query Tool, follow the guidelines below:
21
+
22
+ - **API Key**: You will need to obtain an API key from People Data Labs. If you don't have one, visit their website to get your API key.
23
+
24
+ - **Type**: Specify whether you want to search for a person or a company. You can select either option from the dropdown menu.
25
+
26
+ - **SQL Query**: Provide the SQL query you want to run. You can find this query on the dashboard provided by People Data Labs.
27
+
28
+ - **Dataset**: If you want to narrow down your search to a specific dataset, you can enter the dataset name. Leaving this field blank will search across all datasets.
29
+
30
+ - **Size**: Specify the number of results to return. The form accepts values from 1 to 1000 and defaults to 10.
31
+
32
+ - **Pretty Print**: By default, the results are not pretty printed. If you prefer the results to be displayed in a more readable format, you can enable this option.
33
+
34
+ ## Installation and Local Setup
35
+
36
+ Follow the steps below to install and run the People Data Labs Query Tool locally:
37
+
38
+ 1. Clone the repository to your local machine:
39
+
40
+ ```bash
41
+ git clone https://github.com/your_username/your_repo.git
42
+ ```
43
+
44
+ 2. Navigate to the project directory:
45
+
46
+ ```bash
47
+ cd your_repo
48
+ ```
49
+
50
+ 3. Create a virtual environment (optional, but recommended):
51
+
52
+ ```bash
53
+ python3 -m venv venv
54
+ ```
55
+
56
+ 4. Activate the virtual environment:
57
+
58
+ ```bash
59
+ source venv/bin/activate
60
+ ```
61
+
62
+ 5. Install the required dependencies from `requirements.txt`:
63
+
64
+ ```bash
65
+ pip install -r requirements.txt
66
+ ```
67
+
68
+ ## Running the Application
69
+
70
+ Once you have completed the installation and local setup, you can run the People Data Labs Query Tool:
71
+
72
+ ```bash
73
+ streamlit run app.py
74
+ ```
75
+
76
+ After executing the command above, Streamlit will launch a local development server and display a URL (usually http://localhost:8501) where you can access the web application in your browser.
77
+
78
+ ## Note
79
+
80
+ Keep your API key secure: do not share it publicly or commit it to version control.
81
+
82
+ Happy searching with the People Data Labs Query Tool!
83
+
84
+ ## Resources
85
+
86
+ The app is built with [Streamlit](https://streamlit.io/), a Python library that makes it easy to build web applications for machine learning and data science.
87
+
88
+ The app is also deployed on Hugging Face Spaces. You can access it [here](https://huggingface.co/spaces/adityapatkar/PDL_QUERY).
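The README's note about keeping the API key out of version control could be supported by reading the key from the environment during local experiments. The following is only a minimal sketch: the helper `load_api_key` and the variable name `PDL_API_KEY` are assumptions made for illustration and are not part of this commit, where the key is typed into the form instead.

```python
import os


def load_api_key(env_var: str = "PDL_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name PDL_API_KEY is an assumption for this sketch; the app
    in this commit asks for the key in a form field instead.
    """
    # Treat a missing variable and a whitespace-only value the same way.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With something like this, the key lives in the shell session (for example `export PDL_API_KEY=...`) rather than in a file that might be committed by accident.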
app.py CHANGED
@@ -1,7 +1,12 @@
 
 
 
 
 
1
  import datetime as dt
2
  import pandas as pd
3
 
4
- import streamlit as st
5
  from peopledatalabs import PDLPY as pdl
6
 
7
 
@@ -9,6 +14,8 @@ def setup():
9
  """
10
  Streamlit related setup. This has to be run for each page.
11
  """
 
 
12
  hide_streamlit_style = """
13
 
14
  <style>
@@ -17,87 +24,110 @@ def setup():
17
  </style>
18
  """
19
  st.markdown(hide_streamlit_style, unsafe_allow_html=True)
20
-
 
21
  def main():
22
- '''
23
- Main function of the app.
24
- '''
 
25
  setup()
26
- st.title('People Data Labs Query Tool')
27
-
28
- st.subheader('Search for a Person or Company')
29
-
30
- st.write('''
31
- This tool allows you to search for a person or company using the People Data Labs API. \n
32
- *Guidelines*: \n
33
- - API Key: Your API key from People Data Labs \n
34
- - Type: Person or Company \n
35
- - SQL Query: The SQL query you want to run. Find it on the dashboard \n
36
- - Dataset: The dataset you want to search in. Leave blank for all \n
37
- - Size: The number of results you want to return \n
38
- - Pretty Print: Whether you want the results to be pretty printed. Default is False \n \n\n
39
- ''')
40
-
41
- #horizontal line
42
- st.markdown('<hr>', unsafe_allow_html=True)
43
- st.markdown('<br>', unsafe_allow_html=True)
44
-
45
- st.subheader('Search')
46
- form = st.form(key='my_form')
47
- api_key = form.text_input(label='API Key')
48
- type_search = form.selectbox(label='Type', options=['Person', 'Company'])
49
- sql_query = form.text_area(label='SQL Query')
 
 
 
 
 
 
50
  sql_query = sql_query.strip('"')
51
- dataset = form.text_input(label='Dataset')
52
- if dataset is None or dataset.strip() == '':
53
- dataset = 'all'
54
- size = form.number_input(label='Size', min_value=1, max_value=1000, value=10)
55
- pretty = form.checkbox(label='Pretty Print')
56
- submit_button = form.form_submit_button(label='Submit')
57
-
 
 
 
 
58
  if submit_button:
59
- if api_key is None or api_key.strip() == '':
60
- st.error('API Key is required!')
61
- elif sql_query is None or sql_query.strip() == '':
62
- st.error('SQL Query is required!')
 
63
  else:
64
- PARAMS = {'sql': sql_query, 'dataset': dataset, 'size': size, 'pretty': pretty}
65
-
 
 
 
 
 
 
66
  client = pdl(api_key=api_key)
67
-
68
- if type_search == 'Person':
69
- response = client.person.search(**PARAMS).json()
70
-
71
- elif type_search == 'Company':
72
- response = client.company.search(**PARAMS).json()
73
-
 
 
74
  if response["status"] == 200:
75
  st.success(f"Found {response['total']} records for this search.")
76
- data = response['data']
77
-
78
- #convert json to csv. Json is list of dicts
79
- df = pd.DataFrame(data)
80
- csv = df.to_csv(index=False)
81
-
82
- #show data
83
- st.markdown('<hr>', unsafe_allow_html=True)
84
- st.markdown('<br>', unsafe_allow_html=True)
85
- st.subheader('Data')
86
- st.download_button(label='Download CSV', data=csv, file_name=f'{type_search}-data-{dt.datetime.now().strftime("%Y-%m-%d %H:%M:%S")}.csv')
87
- st.dataframe(df)
 
 
 
 
 
 
 
 
 
88
  else:
89
- st.error('Error retrieving data!')
90
  st.error("Error:")
91
- st.write(response['error'])
92
-
93
- if __name__ == '__main__':
94
- main()
95
-
96
-
97
-
98
-
99
 
100
 
101
-
102
-
103
-
 
1
+ """
2
+ This is the main app file. It contains the main function of the app.
3
+ """
4
+
5
+ # imports
6
  import datetime as dt
7
  import pandas as pd
8
 
9
+ import streamlit as st
10
  from peopledatalabs import PDLPY as pdl
11
 
12
 
 
14
  """
15
  Streamlit related setup. This has to be run for each page.
16
  """
17
+
18
+ # hide hamburger menu
19
  hide_streamlit_style = """
20
 
21
  <style>
 
24
  </style>
25
  """
26
  st.markdown(hide_streamlit_style, unsafe_allow_html=True)
27
+
28
+
29
  def main():
30
+ """
31
+ Main function of the app.
32
+ """
33
+
34
  setup()
35
+
36
+ # title, subheader, and description
37
+ st.title("People Data Labs Query Tool")
38
+
39
+ st.subheader("Search for a Person or Company")
40
+
41
+ st.write(
42
+ """
43
+ This tool allows you to search for a person or company using the People Data Labs API. \n
44
+ *Guidelines*: \n
45
+ - API Key: Your API key from People Data Labs \n
46
+ - Type: Person or Company \n
47
+ - SQL Query: The SQL query you want to run. Find it on the dashboard \n
48
+ - Dataset: The dataset you want to search in. Leave blank for all \n
49
+ - Size: The number of results you want to return \n
50
+ - Pretty Print: Whether you want the results to be pretty printed.
51
+ Default is False \n \n\n
52
+ """
53
+ )
54
+
55
+ # horizontal line and line break
56
+ st.markdown("<hr>", unsafe_allow_html=True)
57
+ st.markdown("<br>", unsafe_allow_html=True)
58
+
59
+ # search form, this is a streamlit form
60
+ st.subheader("Search")
61
+ form = st.form(key="my_form")
62
+ api_key = form.text_input(label="API Key")
63
+ type_search = form.selectbox(label="Type", options=["Person", "Company"])
64
+ sql_query = form.text_area(label="SQL Query")
65
  sql_query = sql_query.strip('"')
66
+ dataset = form.text_input(label="Dataset")
67
+
68
+ # if dataset is empty, set to all
69
+ if dataset is None or dataset.strip() == "":
70
+ dataset = "all"
71
+
72
+ size = form.number_input(label="Size", min_value=1, max_value=1000, value=10)
73
+ pretty = form.checkbox(label="Pretty Print")
74
+ submit_button = form.form_submit_button(label="Submit")
75
+
76
+ # if submit button is clicked
77
  if submit_button:
78
+ # check if api key and sql query are not empty
79
+ if api_key is None or api_key.strip() == "":
80
+ st.error("API Key is required!")
81
+ elif sql_query is None or sql_query.strip() == "":
82
+ st.error("SQL Query is required!")
83
  else:
84
+ # if all is good, run the query
85
+ params = {
86
+ "sql": sql_query,
87
+ "dataset": dataset,
88
+ "size": size,
89
+ "pretty": pretty,
90
+ }
91
+
92
  client = pdl(api_key=api_key)
93
+
94
+ # select the client based on the type of search
95
+ if type_search == "Person":
96
+ response = client.person.search(**params).json()
97
+
98
+ elif type_search == "Company":
99
+ response = client.company.search(**params).json()
100
+
101
+ # if status is 200, show the data
102
  if response["status"] == 200:
103
  st.success(f"Found {response['total']} records for this search.")
104
+ data = response["data"]
105
+
106
+ # convert json to csv. Json is list of dicts
107
+ data_frame = pd.DataFrame(data)
108
+ csv = data_frame.to_csv(index=False)
109
+
110
+ # show data
111
+ st.markdown("<hr>", unsafe_allow_html=True)
112
+ st.markdown("<br>", unsafe_allow_html=True)
113
+ st.subheader("Data")
114
+ now = dt.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
115
+
116
+ # download button
117
+ st.download_button(
118
+ label="Download CSV",
119
+ data=csv,
120
+ file_name=f"{type_search}-data-{now}.csv",
121
+ )
122
+ st.dataframe(data_frame)
123
+
124
+ # handle errors
125
  else:
126
+ st.error("Error retrieving data!")
127
  st.error("Error:")
128
+ st.write(response["error"])
 
 
 
 
 
 
 
129
 
130
 
131
+ # run the app
132
+ if __name__ == "__main__":
133
+ main()
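The form handling in `main()` mixes validation, parameter building, and file naming with Streamlit calls, which makes it hard to test in isolation. One possible way to separate those pieces is sketched below. The helper names `build_search_params` and `csv_file_name` are hypothetical and do not appear in this commit. Two details differ from the committed code and are assumptions of the sketch: whitespace is stripped before the surrounding double quotes, and the timestamp uses `_` and `-` instead of a space and colons, since colons are not allowed in Windows file names.

```python
import datetime as dt


def build_search_params(sql_query, dataset="", size=10, pretty=False):
    """Validate form values and return the keyword arguments for a search.

    Mirrors the checks in main(): the SQL query is required, and a blank
    dataset falls back to "all".
    """
    # Strip whitespace first, then the surrounding double quotes.
    sql_query = (sql_query or "").strip().strip('"')
    if not sql_query:
        raise ValueError("SQL Query is required!")
    dataset = (dataset or "").strip() or "all"
    return {
        "sql": sql_query,
        "dataset": dataset,
        "size": int(size),
        "pretty": bool(pretty),
    }


def csv_file_name(type_search, now=None):
    """Build a download file name without spaces or colons."""
    now = now or dt.datetime.now()
    return f"{type_search}-data-{now.strftime('%Y-%m-%d_%H-%M-%S')}.csv"
```

In the app, the returned dictionary would be passed on as in the committed code (`client.person.search(**params)` or `client.company.search(**params)`), and the `ValueError` message could be shown with `st.error`.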