“LudicrouslyFastAPI:” Deploy Python Functions as APIs on your own Infrastructure with Runhouse

What if we made building and deploying FastAPI apps even faster? 🤯

Donny Greenberg

CEO @ 🏃‍♀️Runhouse🏠

Published April 5, 2024

FastAPI and Flask are elegant and effective tools, allowing you to stand up a new API endpoint on localhost in minutes just by defining a few functions. Unfortunately, serving the endpoint on localhost is usually where the fun ends. There are some daunting gaps between you and a shareable service from there:

  • DevOps learning curve - The minimum bar for publishing an app to share with others is simply way too high. AI researchers and engineers generally do not want to learn all about Docker, Terraform, Nginx, Authentication, Certificates, Telemetry, etc.
A tweet: "I have somehow become competent with docker but it still feels like I’m beating rocks together."
  • Boilerplate and repetition - Even once you’ve learned your way around DevOps, packaging your Python function into a service requires reintroducing that boilerplate over and over. The middleware and deployment must be wired up in each new application.
  • Iteration and debugging - Debugging through HTTP calls is challenging and the iteration loop is slow - at best you’re restarting the FastAPI server, at worst you’re rebuilding all your artifacts and deploying anew.

What if you could turn any Python function into a service effortlessly, without the DevOps know-how, boilerplate, or DevX hit?

A tweet: i put every project i build on fastapi even if i think i dont need it. always a moment where you go "oh i wish this had an api endpoint"

Deploying and Serving with Runhouse

We’re excited to share a new suite of features in Runhouse to provide this zero-to-serving experience, taking you from a Python function or class to an endpoint serving on your own infrastructure. All this, plus a high-iteration debuggable dev experience with lightning fast redeployment. Existing Runhouse users will recognize the APIs - simply send your Python function to your infrastructure and we’ll give you an endpoint back that you can call or share with others.

    # welcome_app.py
    import runhouse as rh


    def welcome(name: str = "home"):
        return f"Welcome {name}!"


    if __name__ == "__main__":
        cpu_box = rh.ondemand_cluster(
            name="my-cpu",
            instance_type="CPU:2",
            provider="aws",
            open_ports=[443],
            server_connection_type="tls",
            autostop_mins=-1,
        ).up_if_not()

        remote_fn = rh.function(welcome).to(cpu_box)
        print(remote_fn("home"))
        print(remote_fn.endpoint())

Et voilà, you have a service running on the cloud of your choice (in this case, AWS) which can handle thousands of requests per second. In this short stretch of code, Runhouse is:

  1. Launching the compute using your local cloud credentials (via SkyPilot), including opening a port, and installing Runhouse
  2. Installing Caddy, creating self-signed certificates to secure traffic with HTTPS (you can provide your own domain as well), and starting Caddy as a reverse proxy
  3. Starting a Runhouse HTTP server daemon on the remote machine
  4. Creating the “env” in which the function will serve - syncing over code and installing dependencies
  5. Deploying the function into the Runhouse server
  6. Giving you a client to call and debug the service, as well as an HTTP endpoint

When we rerun the script, most of the above is aggressively cached and hot-restarted to provide a snappy, high-iteration experience directly on the deployed app. This is much of the magic of Runhouse - we’ve worked for the last year to achieve what we feel is the best possible DevX for deploying Python apps. If we make a code change and rerun the script, you can see that the new app is redeployed almost instantly, faster than it would take to restart FastAPI or Flask on localhost!

    $ python welcome_app.py
    INFO | 2024-04-02 06:04:09.956727 | Server my-cpu is up.
    INFO | 2024-04-02 06:04:09.957834 | Copying package from file:///Users/donny/code/rh_samples to: my-cpu
    INFO | 2024-04-02 06:04:11.094162 | Calling base_env.install
    INFO | 2024-04-02 06:04:11.686541 | Time to call base_env.install: 0.59 seconds
    INFO | 2024-04-02 06:04:11.699101 | Sending module welcome to my-cpu
    INFO | 2024-04-02 06:04:11.746858 | Calling welcome.call
    INFO | 2024-04-02 06:04:11.766914 | Time to call welcome.call: 0.02 seconds
    Welcome home!
    https://54.173.54.42/welcome

    $ curl -k https://54.173.54.42/welcome/call -X POST -d '{"name":"home!"}' -H 'Content-Type: application/json'
    "Welcome home!"%
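The same HTTP call can be made from Python. Below is a minimal sketch using only the standard library, mirroring the curl command above: the keyword arguments are JSON-encoded and POSTed to `<endpoint>/call`. The helper names `build_call_request` and `call_endpoint` are hypothetical (they are not part of Runhouse), and the response is assumed to be JSON, as the quoted string in the curl output suggests.

```python
import json
import ssl
import urllib.request


def build_call_request(endpoint: str, **kwargs) -> urllib.request.Request:
    """Build a POST request to `<endpoint>/call` with kwargs as a JSON body.

    Hypothetical helper that mirrors the curl command; not a Runhouse API.
    """
    body = json.dumps(kwargs).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_endpoint(endpoint: str, verify_tls: bool = True, **kwargs):
    """Send the request and decode the response, assumed to be JSON."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of `curl -k`: only for self-signed certificates you trust.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_call_request(endpoint, **kwargs)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the self-signed certificate setup from the example, a call would look like `call_endpoint("https://54.173.54.42/welcome", verify_tls=False, name="home")`, substituting your own endpoint. If you provide your own domain and certificate, leave `verify_tls` at its default.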

Runhouse is not meant to replace FastAPI - it’s built on top of it to provide more extensive automation, faster deployment (e.g. hot-restarts), and more built-in middleware. It’s a great place to start with any project, and likely all you need for a large percentage of them, especially internal tools you’ll share among your team. If you eventually decide you’d like to deploy your app through an existing DevOps flow or into an existing container, you don’t need to rework your entire app, as Runhouse is also fully capable of local serving like a FastAPI or Flask app (and like FastAPI, takes advantage of Python async for high performance):

    $ runhouse start --screen
    $ runhouse status
    😈 Runhouse Daemon is running 🏃
    • server_port: 32300
    • den_auth: False
    • server_connection_type: none
    • backend config:
    • use_local_telemetry: False
    • domain: None
    • server_host: 0.0.0.0
    • ips: ['0.0.0.0']
    • resource_subtype: Cluster
    Serving 🍦 :
    base (Env):
    This environment has no resources.

    # welcome_local.py
    import runhouse as rh


    def welcome(name: str = "home"):
        return f"Welcome {name}!"


    if __name__ == "__main__":
        remote_fn = rh.function(welcome).to(rh.here)
        print(remote_fn("home"))

Auth and Discovery in Den

The above examples produce a deployed service akin to a bare FastAPI or Flask app, meaning that they don’t include any DevOps components that we’d want to host separately from the app itself. For example, they are not secured with authentication, which is okay if we’re in a VPC or service mesh, but very problematic if the app is exposed to the public. They also lack service discovery, meaning that they aren’t at a consistent endpoint if we take down the VM and deploy anew later, causing dependent code to be brittle.

We provide these features out of the box through a free hosted service called Runhouse Den, which we’ll continue to expand with more DevOps features such as profiling, monitoring, usage metering, and more. Keep in mind that your app still stays inside of your own infrastructure.

You can create a Runhouse Den account and save your token locally by running:

$ runhouse login

Once logged in, you can enable authentication by passing den_auth=True to the cluster constructor. All calls to the cluster will now require a Runhouse token (which the Python client takes care of for you), and you can grant others access either in the Den UI or by calling:

    # To share just the service, but not access to the underlying cluster
    remote_fn.share("friend@email.com")

    # To share the cluster, including SSH access
    cluster.share("friend@email.com")

Those users can now send HTTP requests to the cluster with their Runhouse token, or reconstruct the original Python client objects:

    remote_fn = rh.function(name="/your_username/my_fn_name")
    cluster = rh.cluster(name="/your_username/my_cluster_name")
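For the plain HTTP route, the token has to travel with each request. The sketch below shows one way to build the headers, under two assumptions that are not confirmed by this post: that the token is sent as a bearer token in the `Authorization` header (check the Runhouse documentation for the exact scheme), and that you have exported it in an environment variable, here given the hypothetical name `RH_TOKEN`.

```python
import os


def auth_headers(token: str) -> dict:
    """Return JSON request headers carrying the token.

    Assumption: bearer scheme in the Authorization header.
    """
    if not token:
        raise ValueError("a non-empty token is required")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }


def token_from_env(var_name: str = "RH_TOKEN") -> str:
    """Read the token from an environment variable (hypothetical name)."""
    token = os.environ.get(var_name, "")
    if not token:
        raise RuntimeError(f"set {var_name} before calling the service")
    return token
```

Reading the token from the environment keeps it out of source control, which matters once the calling code is shared with the same teammates the service is shared with.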

If you share the cluster with write access, they can even deploy new functions to it or SSH directly into the cluster:

    $ runhouse ssh "/your_username/my_cluster_name"

This solves discovery too - when an application loads and calls these clients, the calls will flow to the most recently saved endpoint. This is particularly useful if you’re developing an application locally which calls other services or want to share common services among a team.
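To make the "most recently saved endpoint" behavior concrete, here is a toy illustration of name-based discovery. It is not how Den is implemented; `EndpointRegistry` is a hypothetical in-memory stand-in, and the second address is a made-up documentation IP. The point is only that callers hold a stable name while the address behind it changes.

```python
from typing import Dict


class EndpointRegistry:
    """Toy name-to-endpoint registry: the most recent save wins."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, str] = {}

    def save(self, name: str, endpoint: str) -> None:
        # Overwrites any earlier endpoint saved under the same name.
        self._endpoints[name] = endpoint

    def resolve(self, name: str) -> str:
        try:
            return self._endpoints[name]
        except KeyError:
            raise LookupError(f"no endpoint saved under {name!r}") from None


registry = EndpointRegistry()
registry.save("/your_username/my_fn_name", "https://54.173.54.42/welcome")
# The VM is taken down and redeployed at a new address.
registry.save("/your_username/my_fn_name", "https://203.0.113.7/welcome")
# Dependent code keeps resolving the same name and gets the new address.
current = registry.resolve("/your_username/my_fn_name")
```

Dependent code never hard-codes an IP, so redeploying the VM does not break it.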

Plenty to do

Runhouse is a powerful way to quickly stand up Python services on your own infrastructure. Our mission is to make it easy for anyone to build Python apps on any infrastructure, and we have an exciting roadmap ahead:

  1. Profiling and metering - We have basic telemetry in place, but we don’t support rate limiting, billing, or waterfall profiling of apps. We have the foundation in place and would like to add these soon.
  2. More backends - Broadcasting updates to many replicas (e.g. in serverless deployments) and saving artifacts while keeping deployment time down to seconds is a technical challenge and a huge opportunity.

Getting help

If you have questions or feedback, or are interested in contributing, feel free to open a GitHub issue, message us on Discord, or email me at donny@run.house.
