	# Running Background Tasks

	Related spaces: https://huggingface.co/spaces/freddyaboulton/gradio-google-forms
	Tags: TASKS, SCHEDULED, TABULAR, DATA

	## Introduction

	This guide explains how you can run background tasks from your gradio app.
	Background tasks are operations that you'd like to perform outside the request-response
	lifecycle of your app, either once or on a periodic schedule.
	Examples of background tasks include periodically synchronizing data to an external database or
	sending a report of model predictions via email.
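To make the idea concrete, here is a minimal standard-library sketch of a periodic task that runs in its own thread while the main thread stays free. It is not the approach this guide uses later (that relies on a scheduling library); the helper name `run_periodically` and the 50 ms interval are assumptions chosen purely for illustration.

```python
import threading


def run_periodically(task, interval_seconds, stop_event):
    """Call `task` every `interval_seconds` in a daemon thread until `stop_event` is set.

    Hypothetical helper for illustration; not part of gradio.
    """
    def loop():
        # Event.wait returns False on timeout and True once the event is set
        while not stop_event.wait(interval_seconds):
            task()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Usage: record a "tick" every 50 ms and stop once three ticks were seen
ticks = []
done = threading.Event()
stop = threading.Event()


def tick():
    ticks.append(len(ticks) + 1)
    if len(ticks) >= 3:
        done.set()


worker = run_periodically(tick, 0.05, stop)
done.wait(timeout=5)    # the main thread is not blocked by the task itself
stop.set()              # ask the background loop to exit
worker.join(timeout=5)
```

In a real app the task would be something like a database backup, and the interval would be far longer.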

	## Overview

	We will be creating a simple "Google-forms-style" application to gather feedback from users of the gradio library.
	We will use a local sqlite database to store our data, but we will periodically synchronize the state of the database
	with a [HuggingFace Dataset](https://huggingface.co/datasets) so that our user reviews are always backed up.
	The synchronization will happen in a background task running every 60 seconds.

	At the end of the demo, you'll have a fully working application like this one:

	<gradio-app space="freddyaboulton/gradio-google-forms"> </gradio-app>

	## Step 1 - Write your database logic 💾

	Our application will store the name of the reviewer, their rating of gradio on a scale of 1 to 5, as well as
	any comments they want to share about the library. Let's write some code that creates a database table to
	store this data. We'll also write some functions to insert a review into that table and fetch the latest 10 reviews.

	We're going to use the `sqlite3` library to connect to our sqlite database, but gradio will work with any library.

	The code will look like this:

	```python
	import sqlite3

	import pandas as pd

	DB_FILE = "./reviews.db"
	db = sqlite3.connect(DB_FILE)

	# Create table if it doesn't already exist
	try:
	    db.execute("SELECT * FROM reviews").fetchall()
	    db.close()
	except sqlite3.OperationalError:
	    db.execute(
	        '''
	        CREATE TABLE reviews (id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
	                              created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
	                              name TEXT, review INTEGER, comments TEXT)
	        ''')
	    db.commit()
	    db.close()


	def get_latest_reviews(db: sqlite3.Connection):
	    # Fetch the 10 most recent reviews plus the total number of reviews
	    reviews = db.execute("SELECT * FROM reviews ORDER BY id DESC LIMIT 10").fetchall()
	    total_reviews = db.execute("SELECT COUNT(id) FROM reviews").fetchone()[0]
	    reviews = pd.DataFrame(reviews, columns=["id", "date_created", "name", "review", "comments"])
	    return reviews, total_reviews


	def add_review(name: str, review: int, comments: str):
	    # Insert the new review, then return the refreshed table and count
	    db = sqlite3.connect(DB_FILE)
	    cursor = db.cursor()
	    cursor.execute("INSERT INTO reviews(name, review, comments) VALUES(?,?,?)", [name, review, comments])
	    db.commit()
	    reviews, total_reviews = get_latest_reviews(db)
	    db.close()
	    return reviews, total_reviews
	```
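The try/except check above works, but SQLite also supports `CREATE TABLE IF NOT EXISTS`, which makes the setup step idempotent in a single statement. The sketch below shows that alternative; it is not the code used by the demo. It uses an in-memory database, a hypothetical `init_db` helper, and a made-up sample row so that it can be run on its own.

```python
import sqlite3

CREATE_REVIEWS = """
CREATE TABLE IF NOT EXISTS reviews (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    name TEXT, review INTEGER, comments TEXT)
"""


def init_db(db: sqlite3.Connection) -> None:
    """Create the reviews table if needed; safe to call more than once."""
    db.execute(CREATE_REVIEWS)
    db.commit()


db = sqlite3.connect(":memory:")  # in-memory database, for illustration only
init_db(db)
init_db(db)  # the second call is a no-op rather than an error

db.execute(
    "INSERT INTO reviews(name, review, comments) VALUES(?,?,?)",
    ["Ada", 5, "Great library"],  # made-up sample row
)
db.commit()
row_count = db.execute("SELECT COUNT(id) FROM reviews").fetchone()[0]
latest = db.execute(
    "SELECT name, review, comments FROM reviews ORDER BY id DESC LIMIT 10"
).fetchall()
db.close()
```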

	Let's also write a function to load the latest reviews when the gradio application loads:

	```python
	def load_data():
	    db = sqlite3.connect(DB_FILE)
	    reviews, total_reviews = get_latest_reviews(db)
	    db.close()
	    return reviews, total_reviews
	```
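Each function above opens a connection, runs its queries, and closes the connection by hand, so an exception raised by a query would skip the `close()` call. One way to guard against that, shown here as a sketch rather than as the guide's own code, is `contextlib.closing`. The `count_reviews` helper, the simplified table, and the temporary database file are assumptions made so the example runs on its own.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def count_reviews(db_file: str) -> int:
    """Return the number of rows in `reviews`, always closing the connection."""
    # closing() calls db.close() on exit, even if the query raises
    with closing(sqlite3.connect(db_file)) as db:
        return db.execute("SELECT COUNT(id) FROM reviews").fetchone()[0]


# Build a throwaway database to exercise the helper
tmp_dir = tempfile.mkdtemp()
demo_db_file = os.path.join(tmp_dir, "reviews.db")
with closing(sqlite3.connect(demo_db_file)) as db:
    db.execute(
        "CREATE TABLE reviews (id INTEGER PRIMARY KEY, name TEXT, review INTEGER, comments TEXT)"
    )
    db.executemany(
        "INSERT INTO reviews(name, review, comments) VALUES(?,?,?)",
        [("Ada", 5, "Great"), ("Grace", 4, "Good")],  # made-up sample rows
    )
    db.commit()

total = count_reviews(demo_db_file)
```

Note that using a `sqlite3` connection directly in a `with` statement only manages the transaction; it does not close the connection, which is why `closing` is used here.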

	## Step 2 - Create a gradio app ⚡

	Now that we have our database logic defined, we can use gradio to create a dynamic web page to ask our users for feedback!

	```python
	import gradio as gr

	with gr.Blocks() as demo:
	    with gr.Row():
	        # Left column: the feedback form
	        with gr.Column():
	            name = gr.Textbox(label="Name", placeholder="What is your name?")
	            review = gr.Radio(label="How satisfied are you with using gradio?", choices=[1, 2, 3, 4, 5])
	            comments = gr.Textbox(label="Comments", lines=10, placeholder="Do you have any feedback on gradio?")
	            submit = gr.Button(value="Submit Feedback")
	        # Right column: the latest reviews and the running total
	        with gr.Column():
	            data = gr.Dataframe(label="Most recently created 10 rows")
	            count = gr.Number(label="Total number of reviews")
	    submit.click(add_review, [name, review, comments], [data, count])
	    demo.load(load_data, None, [data, count])
	```

	## Step 3 - Synchronize with HuggingFace Datasets 🤗

	We could call `demo.launch()` after step 2 and have a fully functioning application. However,
	our data would be stored locally on our machine. If the sqlite file were accidentally deleted, we'd lose all of our reviews!
	Let's back up our data to a dataset on the HuggingFace hub.

	Create a dataset [here](https://huggingface.co/datasets) before proceeding.

	Now at the top of our script, we'll use the [huggingface hub client library](https://huggingface.co/docs/huggingface_hub/index)
	to connect to our dataset and pull the latest backup.

	```python
	import os
	import shutil

	import huggingface_hub

	TOKEN = os.environ.get('HUB_TOKEN')
	repo = huggingface_hub.Repository(
	    local_dir="data",
	    repo_type="dataset",
	    clone_from="<name-of-your-dataset>",
	    use_auth_token=TOKEN
	)
	repo.git_pull()

	# Start from the latest backup stored in the dataset
	shutil.copyfile("./data/reviews.db", DB_FILE)
	```
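If the dataset repository does not contain a `reviews.db` yet (for example, on the very first run against a freshly created dataset), the `shutil.copyfile` call above raises `FileNotFoundError`. A small guard avoids that. This is a sketch under that assumption: `restore_backup` is a hypothetical helper name, and the temporary paths stand in for `./data/reviews.db` and `DB_FILE`.

```python
import os
import shutil
import tempfile


def restore_backup(backup_path: str, db_file: str) -> bool:
    """Copy the backup over the local database if a backup exists.

    Returns True when a backup was restored, False when there was nothing to copy.
    """
    if not os.path.exists(backup_path):
        return False
    shutil.copyfile(backup_path, db_file)
    return True


# Usage with temporary stand-in paths
work_dir = tempfile.mkdtemp()
backup = os.path.join(work_dir, "backup.db")
local = os.path.join(work_dir, "reviews.db")

first_run = restore_backup(backup, local)   # no backup exists yet

with open(backup, "wb") as f:
    f.write(b"placeholder bytes")           # pretend a backup was pulled
second_run = restore_backup(backup, local)
```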

	Note that you'll have to get an access token from the "Settings" tab of your HuggingFace account for the above code to work.
	In the script, the token is securely accessed via an environment variable.

	![access_token](https://github.com/gradio-app/gradio/blob/main/guides/assets/access_token.png?raw=true)
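`os.environ.get` returns `None` when the variable is not set, so a missing token may only surface later as an authentication error. Below is a sketch of a fail-fast check; `get_hub_token` is a hypothetical helper, `HUB_TOKEN` is the variable name used in this guide, and the dummy value stands in for a real token.

```python
import os


def get_hub_token(var_name: str = "HUB_TOKEN") -> str:
    """Return the token stored in the given environment variable or raise a clear error."""
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to your access token")
    return token


# Usage: a dummy value stands in for a real token here
os.environ["HUB_TOKEN"] = "hf_dummy_value_for_illustration"
token = get_hub_token()
```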

	Now we will create a background task to sync our local database to the dataset hub every 60 seconds.
	We will use the [Advanced Python Scheduler (APScheduler)](https://apscheduler.readthedocs.io/en/3.x/) to handle the scheduling.
	However, this is not the only task scheduling library available. Feel free to use whatever you are comfortable with.

	The function to back up our data will look like this:

	```python
	import datetime

	from apscheduler.schedulers.background import BackgroundScheduler

	def backup_db():
	    # Copy the database file and a CSV export into the local dataset clone
	    shutil.copyfile(DB_FILE, "./data/reviews.db")
	    db = sqlite3.connect(DB_FILE)
	    reviews = db.execute("SELECT * FROM reviews").fetchall()
	    db.close()
	    pd.DataFrame(reviews).to_csv("./data/reviews.csv", index=False)
	    print("updating db")
	    # Push without blocking so the scheduler thread is not held up
	    repo.push_to_hub(blocking=False, commit_message=f"Updating data at {datetime.datetime.now()}")


	# Run backup_db every 60 seconds in a background thread
	scheduler = BackgroundScheduler()
	scheduler.add_job(func=backup_db, trigger="interval", seconds=60)
	scheduler.start()
	```
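`shutil.copyfile` copies the database file byte for byte, which can in principle capture a half-written file if a review is being inserted at the same moment. The `sqlite3` module also offers `Connection.backup` (Python 3.7+), which copies a consistent snapshot. The sketch below swaps that in for the file copy; it is an alternative, not the code the demo uses, and `snapshot_db`, the simplified table, and the temporary paths are illustrative assumptions.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def snapshot_db(src_file: str, dest_file: str) -> None:
    """Write a consistent copy of the database at src_file to dest_file."""
    with closing(sqlite3.connect(src_file)) as src, closing(sqlite3.connect(dest_file)) as dest:
        src.backup(dest)  # SQLite's online backup API


# Usage with a throwaway database
work_dir = tempfile.mkdtemp()
live = os.path.join(work_dir, "reviews.db")
copy = os.path.join(work_dir, "backup.db")

with closing(sqlite3.connect(live)) as db:
    db.execute(
        "CREATE TABLE reviews (id INTEGER PRIMARY KEY, name TEXT, review INTEGER, comments TEXT)"
    )
    db.execute(
        "INSERT INTO reviews(name, review, comments) VALUES(?,?,?)",
        ["Ada", 5, "Great"],  # made-up sample row
    )
    db.commit()

snapshot_db(live, copy)

with closing(sqlite3.connect(copy)) as db:
    copied_rows = db.execute("SELECT name, review, comments FROM reviews").fetchall()
```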

	## Step 4 (Bonus) - Deployment to HuggingFace Spaces

	You can use the HuggingFace [Spaces](https://huggingface.co/spaces) platform to deploy this application for free ✨

	If you haven't used Spaces before, follow the previous guide [here](/using_hugging_face_integrations).
	You will have to add the `HUB_TOKEN` environment variable as a secret in your Space.

	## Conclusion

	Congratulations! You now know how to run background tasks from your gradio app on a schedule ⏲️.

	Check out the application running on Spaces [here](https://huggingface.co/spaces/freddyaboulton/gradio-google-forms).
	The complete code is [here](https://huggingface.co/spaces/freddyaboulton/gradio-google-forms/blob/main/app.py).