Image

The Image class lets you specify a pre-built base image to use at launch time, as well as any additional setup steps your program requires, such as package installs and environment variables.

Image Class

class kubetorch.Image(name: str | None = None, image_id: str | None = None, python_path: str | None = None)
__init__(name: str | None = None, image_id: str | None = None, python_path: str | None = None)

Kubetorch Image object, specifying cluster setup properties and steps.

Parameters:
  • name (str, optional) – Name to assign the Kubetorch image. (Default: None)

  • image_id (str, optional) – Machine image to use, if any. (Default: None)

  • python_path (str, optional) – Absolute path to the Python executable to use for the remote server and installs. (Default: None)

Example

import kubetorch as kt

my_image = (
    kt.Image(name="base_image")
    .pip_install(["numpy", "pandas"])
    .set_env_vars({"OMP_NUM_THREADS": 1})
)

from_docker(image_id: str)

Set up and use an existing Docker image.

Parameters:
  • image_id (str) – Docker image in the format "<registry>/<image>:<tag>".
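
As a minimal sketch of the reference format described above, the helper below assembles a "<registry>/<image>:<tag>" string and passes it to from_docker. The helper docker_image_ref, the function image_from_registry, and the registry, image, and tag values are all hypothetical and not part of Kubetorch. The sketch also assumes that from_docker returns the image for chaining, as pip_install does in the class example.

```python
def docker_image_ref(registry: str, image: str, tag: str = "latest") -> str:
    """Assemble a reference in the "<registry>/<image>:<tag>" format."""
    for label, part in (("registry", registry), ("image", image), ("tag", tag)):
        if not part:
            raise ValueError(f"{label} must be a non-empty string")
    # Drop a trailing slash so the registry and image are joined by exactly one.
    return f"{registry.rstrip('/')}/{image}:{tag}"


def image_from_registry():
    # Imported here so docker_image_ref stays usable without kubetorch installed.
    import kubetorch as kt

    # Hypothetical registry, image name, and tag.
    ref = docker_image_ref("registry.example.com", "trainer", "1.2.0")
    # Assumption: from_docker returns the image so calls can be chained.
    return kt.Image(name="trainer_image").from_docker(ref)
```

Building the reference in one place keeps the tag explicit, which makes it easier to see which image version a launch will use.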

pip_install(reqs: List[Package | str], force: bool = False)

Pip install the given packages.

Parameters:
  • reqs (List[Package or str]) – Packages to pip install into the cluster's environment.

  • force (bool, optional) – Whether to reinstall a package even if it is already installed on the compute. (Default: False)
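
Since reqs is a plain list of strings, one way to reuse an existing requirements file is to read it into a list first. The sketch below is illustrative only: requirements_from_text and image_with_requirements are hypothetical helpers, and it assumes that requirement strings with version specifiers (such as "numpy>=1.26") are passed through to pip unchanged.

```python
from typing import List


def requirements_from_text(text: str) -> List[str]:
    """Return requirement lines, skipping blanks, comments, and pip options."""
    reqs = []
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing inline comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        reqs.append(line)
    return reqs


def image_with_requirements(text: str):
    # Imported here so the parser above works without kubetorch installed.
    import kubetorch as kt

    # Assumption: version specifiers are accepted as-is by pip_install.
    return kt.Image(name="base_image").pip_install(requirements_from_text(text))
```

Lines that begin with "-" (for example "-r extra.txt") are skipped because they are pip options rather than package names.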

set_env_vars(env_vars: Dict)

Set environment variables.

Parameters:
  • env_vars (Dict) – Dict of environment variables and values to set.
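
The class example passes an integer value for OMP_NUM_THREADS. Environment variables are strings at the operating-system level, so callers who prefer to make that conversion explicit could normalize the dict first. This is a sketch only: stringify_env and image_with_env are hypothetical helpers, not part of Kubetorch, and the conversion may be unnecessary if set_env_vars already handles non-string values.

```python
from typing import Any, Dict


def stringify_env(env_vars: Dict[str, Any]) -> Dict[str, str]:
    """Convert every key and value to str, as the OS environment stores them."""
    return {str(key): str(value) for key, value in env_vars.items()}


def image_with_env():
    # Imported here so stringify_env works without kubetorch installed.
    import kubetorch as kt

    env = stringify_env({"OMP_NUM_THREADS": 1, "TOKENIZERS_PARALLELISM": False})
    return kt.Image(name="base_image").set_env_vars(env)
```

Note that Python booleans become "True" and "False"; a program that expects "1" or "0" would need those values written out explicitly.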

sync_package(package: str, force: bool = False)

Sync a local package over and add it to the path.

Parameters:
  • package (Package or str) – Package to sync. Either the name of a local package installed in editable mode, or the path to the folder to sync over.

  • force (bool, optional) – Whether to re-sync the package even if it was previously synced over. (Default: False)
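
When passing a folder path rather than a package name, it can help to resolve and check the path before building the image. The sketch below assumes sync_package accepts an absolute path string and returns the image for chaining; package_dir, image_with_local_package, and the ./my_package folder are hypothetical.

```python
from pathlib import Path


def package_dir(path: str) -> str:
    """Resolve a local folder to an absolute path, failing early if it is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_dir():
        raise NotADirectoryError(f"{resolved} is not a directory")
    return str(resolved)


def image_with_local_package():
    # Imported here so package_dir works without kubetorch installed.
    import kubetorch as kt

    # Hypothetical local folder; force=True re-syncs even if it was synced before.
    return kt.Image(name="base_image").sync_package(
        package_dir("./my_package"), force=True
    )
```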

run_bash(command: str, force: bool = False)

Run bash commands.

Parameters:
  • command (str) – Shell commands to run on the cluster.

  • force (bool, optional) – Whether to rerun the command on the cluster even if it already ran during image setup. (Default: False)
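
Because command is a single string, several setup steps can be combined before the call. The sketch below joins them with "&&" so that a failing step stops the ones after it. join_commands and image_with_system_packages are hypothetical helpers, the apt-get commands are only examples, and the sketch assumes run_bash returns the image for chaining.

```python
from typing import Iterable


def join_commands(commands: Iterable[str]) -> str:
    """Join commands with && so a failing command stops the ones after it."""
    return " && ".join(c.strip() for c in commands if c.strip())


def image_with_system_packages():
    # Imported here so join_commands works without kubetorch installed.
    import kubetorch as kt

    setup = join_commands(["apt-get update", "apt-get install -y git"])
    return kt.Image(name="base_image").run_bash(setup)
```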

rsync(source: str, dest: str, contents: bool = False, filter_options: str | None = None, force: bool = False)

Sync the contents of the source directory into the destination.

Parameters:
  • source (str) – The source path.

  • dest (str) – The target path.

  • contents (bool, optional) – Whether to copy the contents of the source directory or the directory itself. If True, the contents of the source directory are copied into the destination, and the source directory itself is not created there. If False, the source directory is copied along with its contents, creating an additional directory layer at the destination. (Default: False)

  • filter_options (str, optional) – The filter options for rsync. (Default: None)

  • force (bool, optional) – Whether to run the rsync command again even if the same sync was already performed for the image. (Default: False)
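
The effect of contents can be modeled with a small path calculation. The function below is not part of Kubetorch; it only illustrates, under the description above, where a file from the source directory would land at the destination. landing_path, image_with_data, and all paths are hypothetical, and the sketch assumes rsync returns the image for chaining.

```python
import posixpath


def landing_path(source: str, dest: str, rel_file: str, contents: bool = False) -> str:
    """Model where source/rel_file ends up under dest for each contents setting."""
    if contents:
        # Contents only: files land directly inside dest.
        return posixpath.join(dest, rel_file)
    # Directory itself: an extra layer named after the source directory is created.
    top = posixpath.basename(source.rstrip("/"))
    return posixpath.join(dest, top, rel_file)


def image_with_data():
    # Imported here so landing_path works without kubetorch installed.
    import kubetorch as kt

    # Hypothetical paths; with contents=True no "data" layer is created at dest.
    return kt.Image(name="base_image").rsync(
        source="./data", dest="/srv/app", contents=True
    )
```

For a source of /home/me/data and a destination of /srv/app, the file train.csv maps to /srv/app/train.csv with contents=True and to /srv/app/data/train.csv with contents=False.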