Deploying Flows¶
Reminder to connect to Prefect Cloud or a self-hosted Prefect server instance
Some features in this tutorial, such as scheduling, require you to be connected to a Prefect server.
If using a self-hosted setup, run prefect server start
to run both the webserver and UI.
If using Prefect Cloud, make sure you have successfully authenticated your local environment.
Why deployments?¶
Some of the most common reasons to use an orchestration tool such as Prefect are for scheduling and event-based triggering. Up to this point, we’ve demonstrated running Prefect flows as scripts, but this means you have been the one triggering and managing flow runs. You can certainly continue to trigger your workflows in this way and use Prefect as a monitoring layer for other schedulers or systems, but you will miss out on many of the other benefits and features that Prefect offers.
Deploying a flow exposes an API and UI so that you can:
- trigger new runs, cancel active runs, pause scheduled runs, customize parameters, and more
- remotely configure schedules and automation rules for your deployments
- dynamically provision infrastructure using workers
What is a deployment?¶
Deploying a flow is the act of specifying where and how it will run. This information is encapsulated and sent to Prefect as a deployment that contains the crucial metadata needed for remote orchestration. Deployments elevate workflows from functions that you call manually to API-managed entities.
Attributes of a deployment include (but are not limited to):
- Flow entrypoint: path to your flow function
- Schedule or Trigger: optional schedule or triggering rules for this deployment
- Tags: optional text labels for organizing your deployments
Create a deployment¶
Using our get_repo_info
flow from the previous sections, we can easily create a deployment for it by calling a single method on the flow object: flow.serve
.
import httpx
from prefect import flow
@flow(log_prints=True)
def get_repo_info(repo_name: str = "PrefectHQ/prefect"):
url = f"https://api.github.com/repos/{repo_name}"
response = httpx.get(url)
response.raise_for_status()
repo = response.json()
print(f"{repo_name} repository statistics 🤓:")
print(f"Stars 🌠 : {repo['stargazers_count']}")
print(f"Forks 🍴 : {repo['forks_count']}")
if __name__ == "__main__":
get_repo_info.serve(name="my-first-deployment")
Running this script will do two things:
- create a deployment called "my-first-deployment" for your flow in the Prefect API
- stay running to listen for flow runs for this deployment; when a run is found, it will be asynchronously executed within a subprocess
Deployments must be defined in static files
Flows can be defined and run interactively, that is, within REPLs or Notebooks. Deployments, on the other hand, require that your flow definition be in a known file (which can be located on a remote filesystem in certain setups, as we'll see in the next section of the tutorial).
Because this deployment has no schedule or triggering automation, you will need to use the UI or API to create runs for it. Let's use the CLI (in a separate terminal window) to create a run for this deployment:
prefect deployment run 'get-repo-info/my-first-deployment'
If you are watching either your terminal or your UI, you should see the newly created run execute successfully!
Let's take this example further by adding a schedule and additional metadata.
Additional options¶
The serve
method on flows exposes many options for the deployment.
Let's use a few of these options now:
cron
: a keyword that allows us to set a cron string schedule for the deployment; see schedules for more advanced scheduling optionstags
: a keyword that allows us to tag this deployment and its runs for bookkeeping and filtering purposesdescription
: a keyword that allows us to document what this deployment does; by default the description is set from the docstring of the flow function, but we did not document our flow functionversion
: a keyword that allows us to track changes to our deployment; by default a hash of the file containing the flow is used; popular options include semver tags or git commit hashes
Let's add these options to our deployment:
if __name__ == "__main__":
get_repo_info.serve(
name="my-first-deployment",
cron="* * * * *",
tags=["testing", "tutorial"],
description="Given a GitHub repository, logs repository statistics for that repo.",
version="tutorial/deployments",
)
When you rerun this script, you will find an updated deployment in the UI that is actively scheduling work!
Stop the script in the CLI using CTRL+C
and your schedule will be automatically paused.
.serve
is a long-running process
For remotely triggered or scheduled runs to be executed, your script with flow.serve
must be actively running.
Running multiple deployments at once¶
This method is useful for creating deployments for single flows, but what if we have two or more flows? This situation only requires a few additional method calls and imports to get up and running:
import time
from prefect import flow, serve
@flow
def slow_flow(sleep: int = 60):
"Sleepy flow - sleeps the provided amount of time (in seconds)."
time.sleep(sleep)
@flow
def fast_flow():
"Fastest flow this side of the Mississippi."
return
if __name__ == "__main__":
slow_deploy = slow_flow.to_deployment(name="sleeper", interval=45)
fast_deploy = fast_flow.to_deployment(name="fast")
serve(slow_deploy, fast_deploy)
A few observations:
- the
flow.to_deployment
interface exposes the exact same options asflow.serve
; this method produces a deployment object - the deployments are only registered with the API once
serve(...)
is called - when serving multiple deployments, the only requirement is that they share a Python environment; they can be executed and scheduled independently of each other
Spend some time experimenting with this setup. A few potential next steps for exploration include:
- pausing and unpausing the schedule for the "sleeper" deployment
- using the UI to submit ad-hoc runs for the "sleeper" deployment with different values for
sleep
- cancelling an active run for the "sleeper" deployment from the UI (good luck cancelling the "fast" one 😉)
Hybrid execution option
Another implication of Prefect's deployment interface is that you can choose to use our hybrid execution model. Whether you use Prefect Cloud or host a Prefect server instance yourself, you can run work flows in the environments best suited to their execution. This model allows you efficient use of your infrastructure resources while maintaining the privacy of your code and data. There is no ingress required. For more information read more about our hybrid model.
Next steps¶
Congratulations! You now have your first working deployment.
Deploying flows through the serve
method is a fast way to start scheduling flows with Prefect.
However, if your team has more complex infrastructure requirements or you'd like to have Prefect manage flow execution, you can deploy flows to a work pool.
Learn about work pools and how Prefect Cloud can handle infrastructure configuration for you in the next step of the tutorial.