Python API

Compute

The Compute class allows you to define the resources and environment needed for your workloads, while controlling how the compute is managed and scaled based on demand. This includes specifying hardware requirements that can be either generic or tailored to your specific Kubernetes infrastructure and setup.

Resource Requirements: Specify any resource requests that your infrastructure supports, including the number of CPUs/GPUs, specific GPU types, memory, or disk size.

Base Environment: Highly customizable runtime dependencies through the Image class. Use a pre-built image or customize it at launch time with additional installations, environment variables, and setup commands.

Distribution & Scaling: Support for distributed computing patterns, including PyTorch and Ray distribution types, autoscaling configurations with replica and concurrency controls, and resource management that can automatically scale down or tear down idle services to optimize cluster utilization.

Kubernetes Configuration: Fine-grained control over Kubernetes-specific configurations such as namespace, labels/annotations, secrets, service accounts, and other RBAC features.

Production Controls: Freeze settings to prevent code syncs and updates for stable production deployments, ensuring consistent behavior across environments.
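Putting these pieces together, a Compute definition might look like the following sketch. Only `cpus`, `queue`, and `namespace` appear verbatim in this guide; the remaining parameter names (`memory`, `image`, `image_id`) are illustrative assumptions, not confirmed API.

```python
import kubetorch as kt

# Sketch of a Compute definition. Parameter names other than `cpus`,
# `queue`, and `namespace` are illustrative assumptions.
compute = kt.Compute(
    cpus="2",                                     # resource requests
    memory="8Gi",
    image=kt.Image(image_id="python:3.11-slim"),  # base environment
    namespace="ml-team",                          # Kubernetes configuration
)
```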

Image

The Image class enables you to define and customize the containerized environment for your workloads. You can specify a pre-built Docker image as your foundation and layer on additional setup steps that run at launch time, eliminating the need to rebuild images for every code change.

Image integrates with the Compute class to define your complete runtime environment. When deploying your workflow, the specified pre-built image serves as the base layer for the underlying service. Your additional setup steps then execute automatically before your application code runs.
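As a sketch, layering launch-time setup onto a pre-built base might look like this; the chained method names (`pip_install`, `set_env_vars`, `run_bash`) and the `gpus` parameter are assumptions for illustration.

```python
import kubetorch as kt

# Hypothetical Image customization; method names are assumed for
# illustration. Each step runs at launch time on top of the base image,
# so no image rebuild is needed when they change.
image = (
    kt.Image(image_id="nvcr.io/nvidia/pytorch:24.01-py3")
    .pip_install(["transformers", "datasets"])    # additional installations
    .set_env_vars({"HF_HOME": "/data/hf-cache"})  # environment variables
    .run_bash("mkdir -p /data/hf-cache")          # setup command
)

compute = kt.Compute(gpus=1, image=image)
```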

Module

The Fn and Cls classes are wrappers around your locally defined Python functions and classes, respectively. Once wrapped, these objects can be deployed with .to(compute), which launches a service on your cluster (taking the compute requirements into account) and syncs over the files needed to run the function remotely.

Seamless Development Workflow: After initial deployment, you can continue updating your Python code locally and redeploy with .to(), which re-syncs updates in seconds and returns a new function or class that is immediately ready to use.

Native Python Interface: The returned object is a callable that behaves exactly like the original Python method, except it runs on your remote compute rather than locally. When you call the object, it makes an HTTP request to the deployed service.

Built-in Observability: Kubetorch provides log streaming, flexible error handling, and observability features like traces and metrics. Debugging is simplified through fast iteration cycles and the ability to SSH directly into your compute or use PDB to set breakpoints for interactive debugging. See more about observability in the advanced install docs.
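The workflow above can be sketched as follows; the `preprocess` function and the `cpus` value are placeholders.

```python
import kubetorch as kt

def preprocess(text: str) -> str:
    return text.strip().lower()

# First deployment launches a service on the cluster and syncs the code:
remote_preprocess = kt.fn(preprocess).to(compute=kt.Compute(cpus="0.1"))

# The returned object is called like the original function, but each
# call is an HTTP request to the deployed service:
result = remote_preprocess("  Hello World  ")

# After editing preprocess locally, calling .to() again re-syncs the
# change in seconds and returns a fresh callable:
remote_preprocess = kt.fn(preprocess).to(compute=kt.Compute(cpus="0.1"))
```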

Deployment Modes

Kubetorch supports three deployment modes for modules.

(1) Deployment: Traditional Kubernetes deployments provide reliable, long-running services with built-in health checks and restart policies. This mode is ideal for production workloads that need consistent availability and can handle multiple replicas for load distribution. Deployments are perfect for stateless services, APIs, and background processing tasks that require guaranteed uptime.

(2) RayCluster: Distributed computing mode using Ray for parallel and distributed workloads. RayCluster is optimized for ML workloads that benefit from parallel execution across multiple nodes.

(3) Knative Service: Serverless deployment mode that can automatically scale from zero to handle varying traffic loads. Knative services are ideal for intermittent workloads, development environments, and applications with unpredictable usage patterns.

The deployment mode is automatically selected based on your compute configuration, or can be explicitly controlled through the Compute class parameters such as autoscaling settings and distribution configuration.
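A rough sketch of how compute configuration might map to a deployment mode; the `distribution` and `autoscale` parameter names are hypothetical stand-ins for the distribution and autoscaling controls described above, not confirmed API.

```python
import kubetorch as kt

# Hypothetical parameter names (`distribution`, `autoscale`) used only
# to illustrate how configuration could select a deployment mode:
service = kt.Compute(cpus="1")                        # -> Deployment
ray_cluster = kt.Compute(gpus=4, distribution="ray")  # -> RayCluster
serverless = kt.Compute(
    cpus="0.5",
    autoscale={"min_replicas": 0},                    # -> Knative Service
)
```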

App

The App class wraps a Python CLI command or script, enabling you to run entire applications remotely on the cluster. Unlike Fn and Cls which wrap individual functions or classes, App deploys and executes complete Python files with all their dependencies, making it ideal for training scripts, data processing pipelines, or even web applications.

Flexible Application Support: App can handle any Python file. The entire application context is preserved, including command-line arguments, environment variables, and file dependencies.

Complete Environment Replication: When you deploy an app, Kubetorch syncs your local project files and requirements to the remote compute, ensuring your application runs in an environment identical to your local setup.

CLI Command Preservation: The App maintains the exact command-line interface of your local script, allowing you to pass arguments, flags, and options exactly as you would locally. This makes it seamless to convert existing scripts for remote execution without code modifications.

Background and Interactive Modes: Apps can run as background services for long-running processes or in interactive mode for development and debugging. You can monitor progress, stream logs, and even SSH into the running environment for real-time troubleshooting.
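As a sketch, deploying a training script might look like the following; the `kt.app` constructor shape is assumed from the description above (by analogy with `kt.fn`), and `train.py` with its flags is a placeholder.

```python
import kubetorch as kt

# Hypothetical sketch: the constructor shape is assumed, and train.py
# is a placeholder script. The CLI command is preserved as-is.
app = kt.app("python train.py --epochs 10 --batch-size 32")
app.to(compute=kt.Compute(gpus=1))
```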

Secrets

Secrets such as provider keys and environment variables can be set when defining compute. They are set at launch time and remain accessible for the duration of your program.
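For example, a sketch of attaching secrets at launch time; the `secrets` and `env_vars` parameter names and their values are illustrative assumptions.

```python
import kubetorch as kt

# Hypothetical parameter names (`secrets`, `env_vars`) illustrating
# launch-time secrets on a Compute definition:
compute = kt.Compute(
    cpus="1",
    secrets=["huggingface"],             # provider key available in-cluster
    env_vars={"WANDB_PROJECT": "demo"},  # environment variable set at launch
)
```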

Config

Kubetorch uses a local configuration file (stored at ~/.kt/config.yaml) to allow you to set global defaults for your services.

Viewing your config:

kt config

Updating your config:

kt config set <key> <value> # Example: kt config set namespace custom-namespace

You can override certain defaults in the Compute resource constructor for a specific service:

```python
import kubetorch as kt

my_service = kt.fn(my_func_or_cls).to(
    compute=kt.Compute(
        cpus="0.1",
        queue="preferred",  # overrides the default queue
        namespace="my-namespace",  # overrides the default namespace
    )
)
```