This tutorial walks through Runhouse setup (installation, hardware setup, and optional login), then goes through an example that demonstrates how to use Runhouse to bridge the gap between local and remote compute, and to create Resources that can be saved, reused, and shared.
Runhouse can be installed with:
!pip install runhouse
If using Runhouse with a cloud provider, you can additionally install cloud packages (e.g. the right versions of tools like boto, gsutil, etc.):
$ pip install "runhouse[aws]"
$ pip install "runhouse[gcp]"
$ pip install "runhouse[azure]"
$ pip install "runhouse[sagemaker]"
# Or
$ pip install "runhouse[all]"
To import runhouse:
import runhouse as rh
# Optional: to sync over secrets from your Runhouse account
# !runhouse login
Runhouse provides APIs and Secrets management to make it easy to interact with your clusters. This can be either an existing, on-prem cluster you have access to, or cloud instances that Runhouse spins up/down for you (through your own cloud account).
Note that Runhouse is NOT managed compute. Everything runs inside your own compute and storage, using your credentials.
If you are using an existing, on-prem cluster, no additional setup is needed. Just have your cluster IP address and path to SSH credentials or password ready:
# using private key
cluster = rh.cluster(
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'ssh_private_key': '<path_to_key>'},
)

# using password
cluster = rh.cluster(
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'password': '******'},
)
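Once defined, the cluster object can be used directly. For example, a quick connectivity check, as a sketch using the cluster.run method (which executes shell commands on the cluster over SSH):

# Run a shell command on the cluster to verify SSH access;
# results are returned per command (exit code, stdout, stderr)
cluster.run(["python --version"])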
Note
For more information see the Cluster Class section.
For on-demand clusters through cloud accounts (e.g. AWS, Azure, GCP, LambdaLabs), Runhouse uses SkyPilot for much of the heavy lifting with launching and terminating cloud instances.
To set up your cloud credentials locally to be able to use on-demand cloud clusters, you can either:
Use SkyPilot’s CLI command !sky check, which provides instructions on logging in or setting up your local config file, depending on the provider (further SkyPilot instructions here)
Use Runhouse’s Secrets API to sync your secrets down into the appropriate local config.
# SkyPilot CLI
!sky check
# Runhouse Secrets
# Lambda Labs:
rh.Secrets.save_provider_secrets(secrets={"lambda": {"api_key": "*******"}})
# AWS:
rh.Secrets.save_provider_secrets(secrets={"aws": {"access_key": "******", "secret_key": "*******"}})
# GCP:
!gcloud init
!gcloud auth application-default login
!cp -r /content/.config/* ~/.config/gcloud
# Azure
!az login
!az account set -s <subscription_id>
To check that the provider credentials are properly configured locally, run sky check to confirm that the cloud provider is enabled:
!sky check
To create a cluster instance, use the rh.cluster() factory function for an existing cluster, or rh.ondemand_cluster() for an on-demand cluster. We go more in depth on how to launch the cluster and run a function on it later in this tutorial.
cluster = rh.ondemand_cluster(
    name="cpu-cluster",
    instance_type="CPU:8",
    provider="cheapest",  # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
).save()
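On-demand clusters can also terminate themselves after a period of inactivity. A minimal sketch, assuming the autostop_mins argument to rh.ondemand_cluster:

# Launch an on-demand cluster that autostops after 60 minutes of inactivity
cluster = rh.ondemand_cluster(
    name="cpu-cluster-autostop",  # hypothetical name for this example
    instance_type="CPU:8",
    provider="cheapest",
    autostop_mins=60,  # assumption: -1 keeps the cluster up indefinitely
).save()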
Note
For more information and hardware setup see the OnDemandCluster Class section.
Runhouse facilitates easy access to existing or new SageMaker compute. Just provide your SageMaker execution role ARN or have it configured in your local environment.
# Launch a new SageMaker instance and keep it up indefinitely
cluster = rh.sagemaker_cluster(name='sm-cluster', profile="sagemaker").save()
# Running a training job with a provided Estimator
from sagemaker.pytorch import PyTorch

pytorch_estimator = PyTorch(
    entry_point='train.py',
    role='arn:aws:iam::123456789012:role/MySageMakerRole',
    source_dir='/Users/myuser/dev/sagemaker',
    framework_version='1.8.1',
    py_version='py36',
    instance_type='ml.p3.2xlarge',
)

cluster = rh.sagemaker_cluster(
    name='sagemaker-cluster',
    estimator=pytorch_estimator,
).save()
Note
For more information and hardware setup see the SageMakerCluster Class section.
Using Runhouse with only the OSS Python package is perfectly fine, but you can unlock some unique portability features by creating an (always free) account and saving down your secrets and/or resource metadata there.
Think of the OSS-package-only experience as akin to Microsoft Office, while creating an account will make your cloud resources sharable and accessible from anywhere like Google Docs.
For instance, if you previously set up cloud provider credentials in order to launch on-demand clusters, simply call runhouse login or rh.login() and choose which of your secrets you want to sync into your Runhouse account. Then, from any other environment, you can download those secrets and use them immediately, without needing to set up your local credentials again. To delete any local credentials or remove secrets from Runhouse, you can call runhouse logout or rh.logout().
Some notes on security:
Our API servers only ever store light metadata about your resources (e.g. folder name, cloud provider, storage bucket, path). All actual data and compute stays inside your own cloud account and never hits our servers.
Secrets are stored in Hashicorp Vault (an industry standard for secrets management), never on our API servers, and our APIs simply call into Vault’s APIs.
!runhouse login
# or
rh.login()
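The corresponding logout flow, to remove local credentials and/or delete secrets from your Runhouse account:

!runhouse logout
# or
rh.logout()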
In the following example, we demonstrate Runhouse’s simple but powerful compute APIs to run locally defined functions on a remote cluster launched through Runhouse, bridging the gap between local and remote. We also show how to save, reuse, and share any of your Runhouse Resources.
Please first make sure that you have successfully followed the Installation and Cluster Setup sections above prior to running this example.
import runhouse as rh
First let’s define a simple local function which returns the number of CPUs available.
def num_cpus():
import multiprocessing
return f"Num cpus: {multiprocessing.cpu_count()}"
num_cpus()
'Num cpus: 10'
Next, instantiate the cluster that we want to run this function on. This can be either an existing cluster where you pass in an IP address and SSH credentials, or a cluster associated with a supported cloud account (AWS, GCP, Azure, LambdaLabs), where it is automatically launched (and optionally terminated) for you.
# Using an existing, bring-your-own cluster
cluster = rh.cluster(
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'ssh_private_key': '<path_to_key>'},
)

# Using a Cloud provider
cluster = rh.cluster(
    name="cpu-cluster",
    instance_type="CPU:8",
    provider="cheapest",  # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
)
If using a cloud cluster, we can launch the cluster with .up() or .up_if_not().
Note that it may take a few minutes for the cluster to be launched through the Cloud provider and set up dependencies.
cluster.up_if_not()
Now that we have our function and remote cluster set up, we’re ready to see how to run this function on our cluster!
We wrap our local function in rh.function, and associate this new function with the cluster. Now, whenever we call this new function, just as we would call any other Python function, it runs on the cluster instead of locally.
num_cpus_cluster = rh.function(name="num_cpus_cluster", fn=num_cpus).to(system=cluster, reqs=["./"])
INFO | 2023-08-29 03:03:52.826786 | Writing out function function to /Users/caroline/Documents/runhouse/runhouse/docs/notebooks/basics/num_cpus_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
/Users/caroline/Documents/runhouse/runhouse/runhouse/rns/function.py:106: UserWarning: reqs and setup_cmds arguments has been deprecated. Please use env instead.
  warnings.warn(
INFO | 2023-08-29 03:03:52.832445 | Setting up Function on cluster.
INFO | 2023-08-29 03:03:53.271019 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-08-29 03:03:53.546892 | Authentication (publickey) successful!
INFO | 2023-08-29 03:03:53.557504 | Checking server cpu-cluster
INFO | 2023-08-29 03:03:54.942843 | Server cpu-cluster is up.
INFO | 2023-08-29 03:03:54.948097 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 03:03:56.480770 | Calling env_20230829_030349.install
base servlet: Calling method install on module env_20230829_030349
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 03:03:58.230209 | Time to call env_20230829_030349.install: 1.75 seconds
INFO | 2023-08-29 03:03:58.462054 | Function setup complete.
num_cpus_cluster()
INFO | 2023-08-29 03:04:01.105011 | Calling num_cpus_cluster.call
base servlet: Calling method call on module num_cpus_cluster
INFO | 2023-08-29 03:04:01.384439 | Time to call num_cpus_cluster.call: 0.28 seconds
'Num cpus: 8'
Runhouse supports saving down the metadata and configs for resources like clusters and functions, so that you can load them from a different environment, or share them with your collaborators.
num_cpus_cluster.save()
<runhouse.resources.function.Function at 0x104634ee0>
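The cluster can be saved the same way, so that it too can be reloaded by name later:

cluster.save()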
num_cpus_cluster.share(
    users=["<email_to_runhouse_account>"],
    access_type="write",
)
Now, you, or whoever you shared it with, can reload this function from another dev environment (such as a different Colab, your local machine, or a cluster), as long as you are logged in to your Runhouse account.
reloaded_function = rh.function(name="num_cpus_cluster")
reloaded_function()
INFO | 2023-08-29 03:04:24.820884 | Checking server cpu-cluster
INFO | 2023-08-29 03:04:25.850301 | Server cpu-cluster is up.
INFO | 2023-08-29 03:04:25.852478 | Calling num_cpus_cluster.call
base servlet: Calling method call on module num_cpus_cluster
INFO | 2023-08-29 03:04:26.127098 | Time to call num_cpus_cluster.call: 0.27 seconds
'Num cpus: 8'
To terminate the cluster, you can run:
cluster.teardown()
⠇ Terminating cpu-cluster
In this tutorial, we demonstrated how to use Runhouse to create references to remote clusters, run local functions on those clusters, and save, share, and reuse functions with a Runhouse account.
Runhouse also lets you:
Send and save data (folders, blobs, tables) between local, remote, and file storage
Send, save, and share dev environments
Reload and reuse saved resources (both compute and data) from different environments (with a Runhouse account)
… and much more!
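For example, a minimal sketch of sending data to a cluster, assuming the rh.folder and rh.blob factory functions from this version of Runhouse and a cluster like the one above that is still up (paths and names here are illustrative):

import pickle

import runhouse as rh

cluster = rh.cluster(name="cpu-cluster").up_if_not()

# Sync a local folder onto the cluster's filesystem
remote_folder = rh.folder(path="./data").to(system=cluster, path="data")

# Send a serialized Python object to the cluster as a blob, and save its metadata
blob = rh.blob(data=pickle.dumps({"step": 100}), name="training-state").to(system=cluster)
blob.save()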