Runhouse has several abstractions to provide a seamless flow of code and execution across local and remote compute. The abstractions are cloud provider-agnostic, and provide living, reusable services.
The Cluster and Function APIs blur the line between program execution and deployment.
The Env and Package APIs help to provide convenient dependency isolation and management.
!pip install runhouse
import runhouse as rh
Optionally, log in to Runhouse to sync any secrets.
!runhouse login
Runhouse provides various APIs for interacting with remote clusters, such as terminating an on-demand cloud cluster or running remote CLI or Python commands from your local dev environment.
There are three types of supported clusters:
Bring-your-own (BYO) Cluster: Existing clusters that you already have up, and access through an IP address and SSH credentials. Please refer to the Bring-Your-Own Cluster section for further instructions.
On-demand Cluster: Associated with your cloud account (AWS, GCP, Azure, LambdaLabs). There are additional features for these clusters, such as cluster (auto) stop. Please refer to On-Demand Cluster for instructions on first getting cloud credentials set up.
SageMaker Cluster (Alpha): Clusters that are created and managed through SageMaker, which can be used as a compute backend (just like BYO or On-Demand clusters) or for running dedicated training jobs. Please refer to the SageMaker Cluster section for instructions on getting set up with SageMaker.
Each cluster must be provided with a unique name identifier during construction. This name parameter is used for saving down or loading previously saved clusters, and is also used for various CLI commands for the cluster (see the save-and-reload sketch after the cluster constructions below).
# BYO cluster
cluster = rh.cluster(  # using private key
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'ssh_private_key': '<path_to_key>'},
)

cluster = rh.cluster(  # using password
    name="cpu-cluster",
    ips=['<ip of the cluster>'],
    ssh_creds={'ssh_user': '<user>', 'password': '******'},
)
# Using a cloud provider
cluster = rh.ondemand_cluster(
    name="cpu-cluster",
    instance_type="CPU:8",
    provider="cheapest",   # "AWS", "GCP", "Azure", "Lambda", or "cheapest" (default)
    autostop_mins=60,      # Optional, defaults to default_autostop_mins; -1 suspends autostop
)
# Launch the cluster
cluster.up()
# Using SageMaker as the compute provider
cluster = rh.sagemaker_cluster(
    name="sm-cluster",
    profile="sagemaker",  # AWS profile with a role ARN configured for SageMaker
)
# Launch the cluster
cluster.up()
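Because each cluster above was given a name, its config can be saved down and reloaded later by that name. A minimal sketch (assuming you are logged in to Runhouse, or are keeping local resource configs):
# Save the cluster's config so it can be reloaded by name later (locally, or to
# your Runhouse account if you are logged in)
cluster.save()
# Reload a previously saved cluster by name, e.g. from another script or machine
cluster = rh.cluster(name="cpu-cluster")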
You can set default configs for future cluster constructions. These defaults are associated with only your local environment (if you don't log in to Runhouse), or can be reused across devices (if they are saved to your Runhouse account).
rh.configs.set('use_spot', False)
rh.configs.set('default_autostop', 30)
rh.configs.upload_defaults()
To run CLI or Python commands on the cluster:
cluster.run(['pip install numpy && pip freeze | grep numpy'])
INFO | 2023-08-29 03:35:44.910826 | Running command on cpu-cluster: pip install numpy && pip freeze | grep numpy
Warning: Permanently added '34.205.23.213' (ED25519) to the list of known hosts.
Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.25.2)
numpy==1.25.2
[(0, 'Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.25.2)\nnumpy==1.25.2\n', "Warning: Permanently added '34.205.23.213' (ED25519) to the list of known hosts.\r\n")]
cluster.run_python(['import numpy', 'print(numpy.__version__)'])
INFO | 2023-08-29 03:35:50.911455 | Running command on cpu-cluster: python3 -c "import numpy; print(numpy.__version__)"
1.25.2
[(0, '1.25.2\n', '')]
To ssh into the cluster:
# Python
cluster.ssh()
# CLI
!ssh cpu-cluster
To tunnel a JupyterLab server into your local browser:
# Python
cluster.notebook()
# CLI
!runhouse notebook cpu-cluster
To open a port, if you want to run an application on the cluster that requires a port to be open (e.g. TensorBoard, Gradio):
cluster.ssh_tunnel(local_port=7860, remote_port=7860)
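For example, a hedged sketch for TensorBoard (assumes TensorBoard is installed on the cluster and that logs live in ./logs there); once it is running, open http://localhost:7860 in your local browser:
# Forward the cluster's port 7860 to the same local port
cluster.ssh_tunnel(local_port=7860, remote_port=7860)
# Launch TensorBoard on the cluster; this call blocks while TensorBoard runs
cluster.run(["tensorboard --logdir ./logs --port 7860 --host 0.0.0.0"])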
Runhouse’s Function API lets you define functions to be run on remote hardware (your cluster above!). Simply pass in a local (or a GitHub) function, the intended remote hardware, and any dependencies; Runhouse will handle the rest for you.
Let’s start with a simple local function getpid, which takes in optional parameters a and b and returns the process ID plus a plus b.
# Local Function
def getpid(a=0, b=0):
    import os
    return os.getpid() + a + b
To construct a function that runs getpid on a remote cluster, we wrap it using rh.function, and specify system=cluster.
# Remote Function
getpid_remote = rh.function(fn=getpid).to(system=cluster)
INFO | 2023-08-29 03:59:14.328987 | Writing out function function to /Users/caroline/Documents/runhouse/runhouse/docs/notebooks/basics/getpid_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2023-08-29 03:59:14.332706 | Setting up Function on cluster.
INFO | 2023-08-29 03:59:14.807140 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-08-29 03:59:15.280859 | Authentication (publickey) successful!
INFO | 2023-08-29 03:59:17.534412 | Found credentials in shared credentials file: ~/.aws/credentials
INFO | 2023-08-29 03:59:18.002794 | Checking server cpu-cluster
INFO | 2023-08-29 03:59:19.059074 | Server cpu-cluster is up.
INFO | 2023-08-29 03:59:19.061851 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 03:59:20.822780 | Calling env_20230829_035913.install
base servlet: Calling method install on module env_20230829_035913
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 03:59:22.728154 | Time to call env_20230829_035913.install: 1.91 seconds
INFO | 2023-08-29 03:59:22.981633 | Function setup complete.
To run the function, call it just as you would a local function; it automatically runs on your specified hardware!
print(f"local function result: {getpid()}")
print(f"remote function result: {getpid_remote()}")
INFO | 2023-08-29 03:59:43.821391 | Calling getpid.call
local function result: 7592
base servlet: Calling method call on module getpid
INFO | 2023-08-29 03:59:44.078775 | Time to call getpid.call: 0.26 seconds
remote function result: 1382396
A neat feature of Runhouse is the ability to take a function from a GitHub repo and create a wrapper around it to be run remotely. This saves you the effort of needing to clone or copy the function. To do so, simply pass the function URL into rh.function.
We’ve implemented the same getpid function here.
Below, we demonstrate how we can directly use the GitHub link and function name to run this function on remote hardware, without needing to clone the repo ourselves or reimplement the function locally.
pid_git_remote = rh.function(
    fn='https://github.com/run-house/runhouse/blob/main/docs/notebooks/sample_fn.py:getpid',
    system=cluster,
)
INFO | 2023-08-29 04:00:01.870718 | Setting up Function on cluster.
INFO | 2023-08-29 04:00:01.873021 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:00:03.145979 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:00:04.625905 | Calling env_20230829_035957.install
base servlet: Calling method install on module env_20230829_035957
Installing package: GitPackage: https://github.com/run-house/runhouse.git@main
Pulling: git -C ./runhouse fetch https://github.com/run-house/runhouse.git
Checking out revision: git checkout main
Installing GitPackage: https://github.com/run-house/runhouse.git@main with method local.
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 04:00:08.100045 | Time to call env_20230829_035957.install: 3.47 seconds
INFO | 2023-08-29 04:00:08.275688 | Function setup complete.
pid_git_remote(1)
INFO | 2023-08-29 04:00:12.015937 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:00:12.285294 | Time to call getpid.call: 0.27 seconds
1382397
.remote and .run
You can use fn.remote() to have the function return a remote object, rather than the result itself. This is a convenient way to avoid passing large objects back and forth to your laptop, or to run longer executions in notebooks without locking up the kernel.
getpid_remote_obj = getpid_remote.remote()
getpid_remote_obj
INFO | 2023-08-29 04:42:17.026532 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:42:17.900012 | Time to call getpid.call: 0.87 seconds
<runhouse.resources.blobs.blob.Blob at 0x154dab3d0>
To retrieve the data from the returned remote object, you can call .fetch() on it.
getpid_remote_obj.fetch()
INFO | 2023-08-29 04:42:18.626515 | Getting getpid_call_20230829_044209_708686
INFO | 2023-08-29 04:42:18.780105 | Time to get getpid_call_20230829_044209_708686: 0.15 seconds
1382396
To run a function async, use fn.run(), which returns a run_key that can be used to retrieve the results and logs at a later point.
getpid_run_key = getpid_remote.run()
getpid_run_key
INFO | 2023-08-29 04:42:20.182323 | Calling getpid.call
INFO | 2023-08-29 04:42:20.318719 | Time to call getpid.call: 0.14 seconds
'getpid_call_20230829_044212_868665'
To retrieve the result of the function run, you can call cluster.get() and pass in the run_key.
cluster.get(getpid_run_key)
INFO | 2023-08-29 04:42:28.747188 | Getting getpid_call_20230829_044212_868665
INFO | 2023-08-29 04:42:28.875886 | Time to get getpid_call_20230829_044212_868665: 0.13 seconds
1382396
stream_logs
To stream logs to local during the remote function call, pass stream_logs=True into the function call.
getpid_remote(stream_logs=True)
INFO | 2023-08-29 04:43:17.812658 | Calling getpid.call
base servlet: Calling method call on module getpid
INFO | 2023-08-29 04:43:18.107531 | Time to call getpid.call: 0.29 seconds
1382396
Function logs are also automatically output to a log file on the cluster they are run on. You can refer to the Runhouse Logging Docs for more information on accessing these logs.
In addition to running basic functions remotely, Runhouse lets you define classes that live and are run remotely, through the Module API.
If you have an existing non-Runhouse class that you would like to run remotely, you can do so as follows:
from package import Model
RemoteModel = rh.module(cls=Model, system=cluster)
remote_model = RemoteModel(model_id="bert-base-uncased", device="cuda")
remote_model.predict("Hello world!") # Runs on cluster
To construct your own Runhouse Module class, simply have your class subclass rh.Module, instantiate it locally, and then send it over to your cluster.
Note that because this class is constructed locally before being sent over to the remote cluster, any computationally heavy operation, such as loading a dataset or model, that we only want to run remotely should be wrapped in a function and called remotely. One way of doing so is through lazy initialization, as in the data property of the module below.
# pid_module.py
import os
import runhouse as rh

class PIDModule(rh.Module):
    def __init__(self, a: int = 0):
        super().__init__()
        self.a = a

    @property
    def data(self):
        # Lazy initialization: the expensive load only happens (remotely) on first access
        if not hasattr(self, '_data'):
            self._data = load_dataset()
        return self._data

    def getpid(self):
        return os.getpid() + self.a
When working in a notebook setting, we define the class in another file, pid_module.py, and load in the module for use below. For Python scripts, the class can be defined in the same file as the script.
from pid_module import PIDModule
remote_module = PIDModule(a=5).to(cluster)
remote_module.getpid()
INFO | 2023-09-05 19:57:10.034443 | Calling PIDModule.getpid
base servlet: Calling method getpid on module PIDModule
INFO | 2023-09-05 19:57:10.308916 | Time to call PIDModule.getpid: 0.27 seconds
21806
Our sample getpid function used only built-in Python dependencies, so we did not need to worry about the function environment.
For more complex functions relying on external dependencies, Runhouse provides concepts for packages (individual dependencies/installations) and environments (group of packages or a conda env).
Runhouse supports pip, conda, reqs, and git packages, which can be constructed in the following ways.
Oftentimes, when using Packages in the context of environments (Envs), you don't need to construct them yourself; you can just pass in the corresponding string, and Runhouse internals will handle the conversion and installation for you.
pip_package = rh.Package.from_string("pip:numpy")
conda_package = rh.Package.from_string("conda:torch")
reqs_package = rh.Package.from_string("reqs:./")
git_package = rh.GitPackage(
    git_url='https://github.com/huggingface/diffusers.git',
    install_method='pip',
    revision='v0.11.1',
)
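For example, the same kinds of packages can be passed as plain strings when constructing an env (a minimal sketch; envs are covered in the next section):
# Strings are converted into Package objects and installed on the cluster for you
env_from_strings = rh.env(reqs=["pip:numpy", "reqs:./"])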
You can also send packages between local, remote, and file storage.
local_package = rh.Package.from_string("local/path/to/folder")
package_on_s3 = local_package.to(system="s3", path="/s3/path/to/folder")
package_on_cluster = local_package.to(system=cluster)
Envs, or environments, keep track of your package installs and corresponding versions. This allows for reproducible dev environments, and convenient dependency isolation and management.
The basic Env resource just consists of a list of Packages, or strings that represent the packages. You can additionally add any environment variables by providing a Dict or a .env local file path, and also set the working directory to be synced over (defaults to the base GitHub repo).
env = rh.env(reqs=["numpy", reqs_package, git_package], env_vars={"USER": "*****"})
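As a hedged sketch of the other options mentioned above (passing a .env file path for env_vars, and the parameter name working_dir, are assumptions based on the description rather than verified API):
# Hypothetical variant: env vars loaded from a local .env file, with an explicit synced directory
env_with_dotenv = rh.env(
    reqs=["numpy", reqs_package, git_package],
    env_vars=".env",    # path to a local .env file, as an alternative to a dict
    working_dir="./",   # assumed parameter name for the directory synced to the cluster
)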
When you send an environment object to a cluster, the environment is automatically set up (packages are installed) on the cluster.
env_on_cluster = env.to(system=cluster)
INFO | 2023-08-29 04:44:06.955053 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:44:08.250678 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:44:09.741572 | Calling env_20230829_044402._set_env_vars
base servlet: Calling method _set_env_vars on module env_20230829_044402
INFO | 2023-08-29 04:44:10.028261 | Time to call env_20230829_044402._set_env_vars: 0.29 seconds
INFO | 2023-08-29 04:44:10.029212 | Calling env_20230829_044402.install
base servlet: Calling method install on module env_20230829_044402
Installing package: Package: numpy
Installing Package: numpy with method pip.
Running: pip install numpy
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
Installing package: GitPackage: https://github.com/huggingface/diffusers.git@v0.11.1
Pulling: git -C ./diffusers fetch https://github.com/huggingface/diffusers.git
Checking out revision: git checkout v0.11.1
Installing GitPackage: https://github.com/huggingface/diffusers.git@v0.11.1 with method pip.
Running: pip install ./diffusers
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 04:44:19.111342 | Time to call env_20230829_044402.install: 9.08 seconds
The CondaEnv resource represents a Conda environment that can be used to set up reproducible Conda envs across clusters.
There are several ways to construct a Runhouse CondaEnv object using rh.conda_env, by passing any of the following into the conda_env parameter:
A yaml file corresponding to a conda environment config
A dict corresponding to a conda environment config
The name of an existing conda env on your local machine
Leaving the argument empty. In this case, we'll construct a new conda environment for you, using the list you pass into reqs.
Beyond the conda config, you can also add any additional requirements you'd like to install in the environment by adding reqs = List[packages].
# 1. config yaml file
conda_env = rh.conda_env(conda_env="conda_env.yml", reqs=["numpy", "diffusers"], name="yaml_env")
# 2. config dict
conda_dict = {"name": "conda_env", "channels": ["conda-forge"], "dependencies": ["python=3.10.0"]}
conda_env = rh.env(conda_env=conda_dict, name="dict_env")
# 3. local conda env
conda_env = rh.conda_env(conda_env="local_conda_env", name="from_local_env")
# 4. empty, construct from reqs
conda_env = rh.conda_env(reqs=["numpy", "diffusers"], name="new_env")
As with the base env, we can set up a conda env on the cluster with:
conda_env_on_cluster = conda_env.to(system=cluster)
INFO | 2023-08-29 04:48:21.600485 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 04:48:23.132095 | Calling new_env.install
new_env servlet: Calling method install on module new_env
Env already installed, skipping
INFO | 2023-08-29 04:48:24.358608 | Time to call new_env.install: 1.23 seconds
Previously, in the cluster section, we mentioned several cluster APIs such as running CLI or Python commands. In the examples above these all ran in the base environment, but now that we've defined a conda env, let's demonstrate how to accomplish the same thing inside a conda env on the cluster:
# run Python command within the conda env
cluster.run_python(['import numpy', 'print(numpy.__version__)'], env=conda_env)
INFO | 2023-08-29 05:14:08.725396 | Running command on cpu-cluster: conda run -n new_env python3 -c "import numpy; print(numpy.__version__)"
1.25.2
[(0, '1.25.2\n\n', '')]
# install additional package on given env
cluster.install_packages(["pandas"], env=conda_env)
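As a quick sanity check, reusing the run_python call from above, you can confirm the install landed in the conda env:
# Verify pandas is importable from within the conda env
cluster.run_python(['import pandas', 'print(pandas.__version__)'], env=conda_env)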
Now that we understand how clusters, functions, and packages/environments work, we can go on to implement more complex functions that require external dependencies, and seamlessly run them on a remote cluster.
def add_lists(list_a, list_b):
    import numpy as np
    return np.add(np.array(list_a), np.array(list_b))
Note that in the function defined, we include the import statement import numpy as np within the function. The import needs to be inside the function definition in notebook or interactive environments, but can live outside the function if the code is being run as a Python script.
env = rh.env(reqs=["numpy"])
add_lists_remote = rh.function(fn=add_lists).to(system=cluster, env=env)
list_a = [1, 2, 3]
list_b = [2, 3, 4]
add_lists_remote(list_a, list_b)
INFO | 2023-08-29 05:20:27.959315 | Writing out function function to /Users/caroline/Documents/runhouse/runhouse/docs/notebooks/basics/add_lists_fn.py. Please make sure the function does not rely on any local variables, including imports (which should be moved inside the function body).
INFO | 2023-08-29 05:20:27.962973 | Setting up Function on cluster.
INFO | 2023-08-29 05:20:27.965670 | Copying package from file:///Users/caroline/Documents/runhouse/runhouse to: cpu-cluster
INFO | 2023-08-29 05:20:29.406978 | Calling env_20230829_052021.install
base servlet: Calling method install on module env_20230829_052021
Installing package: Package: numpy
Installing Package: numpy with method pip.
Running: pip install numpy
Installing package: Package: runhouse
Installing Package: runhouse with method reqs.
reqs path: runhouse/requirements.txt
pip installing requirements from runhouse/requirements.txt with: -r runhouse/requirements.txt
Running: /opt/conda/bin/python3.10 -m pip install -r runhouse/requirements.txt
INFO | 2023-08-29 05:20:32.575986 | Time to call env_20230829_052021.install: 3.17 seconds
INFO | 2023-08-29 05:20:32.774676 | Function setup complete.
INFO | 2023-08-29 05:20:32.791597 | Calling add_lists.call
base servlet: Calling method call on module add_lists
INFO | 2023-08-29 05:20:33.086075 | Time to call add_lists.call: 0.29 seconds
array([3, 5, 7])
Now that you understand the basics, feel free to play around with more complicated scenarios! You can also check out our additional API and example usage tutorials on our docs site.
To terminate the cluster, you can call sky down cluster-name in the CLI or cluster.teardown() in Python.
!sky down cpu-cluster
# or
cluster.teardown()