Command Line Interface

Kubetorch provides a rich set of commands that give you insight into running workloads at both the individual service and cluster level. For more details on the inputs to any command, run kt <command> --help.

kubetorch.cli.config_command(action: str = <typer.models.ArgumentInfo object>, key: str = <typer.models.ArgumentInfo object>, value: str = <typer.models.ArgumentInfo object>)

Manage Kubetorch configuration settings.

Example:

$ kt config set username johndoe
$ kt config set volumes "volume_name_one, volume_name_two"
$ kt config set volumes volume_name_one
$ kt config unset username
$ kt config get username
$ kt config list
kubetorch.cli.service_ssh(name: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>, pod: str = <typer.models.OptionInfo object>)

SSH into a remote service. By default, this connects to the service's first pod.

Example:

$ kt ssh my_service
kubetorch.cli.service_list(namespace: str = <typer.models.OptionInfo object>, sort_by_updated: bool = <typer.models.OptionInfo object>, tag: str = <typer.models.OptionInfo object>)

List all Kubetorch services.

Example:

$ kt list
$ kt list -t dev-branch
kubetorch.cli.metrics(service: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>)

Open the Grafana dashboard for the given service.
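The service name is a positional argument (per the signature above); `my-service` below is a placeholder:

```shell
# Open the Grafana metrics dashboard for a deployed service
$ kt metrics my-service
```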

kubetorch.cli.dashboard(local: bool = <typer.models.OptionInfo object>, namespace: str = <typer.models.OptionInfo object>)

Open the Kubetorch dashboard, with a Grafana session set up.
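A minimal invocation, assuming no flags are needed (both parameters above are optional):

```shell
# Open the Kubetorch dashboard in the browser
$ kt dashboard
```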

kubetorch.cli.teardown(name: str = <typer.models.ArgumentInfo object>, yes: bool = <typer.models.OptionInfo object>, teardown_all: bool = <typer.models.OptionInfo object>, prefix: str = <typer.models.OptionInfo object>, namespace: str = <typer.models.OptionInfo object>, force: bool = <typer.models.OptionInfo object>)

Delete a service and all its associated resources (deployments, configmaps, etc).

Example:

$ kt teardown my-service -y  # force teardown resources corresponding to service
$ kt teardown --all  # teardown all resources corresponding to username
$ kt teardown --prefix test  # teardown resources with prefix "test"
kubetorch.cli.service_status(name: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>)

Load a service’s status.

Example:

$ kt status my-service
kubetorch.cli.docs_command(output: str = <typer.models.ArgumentInfo object>)

Build kubetorch docs locally.

Example:

$ kt docs --output path/to/kt_docs
kubetorch.cli.check(name: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>)

Run a comprehensive health check for a deployed service.

Checks:

  • Deployment pod comes up and becomes ready (if not scaled to zero)

  • Rsync has succeeded

  • Service is marked as ready and service pod(s) are ready to serve traffic

  • GPU support configured (if applicable)

  • Loki/Alloy log streaming configuration (if applicable)

If a step fails, the command dumps kubectl describe output and pod logs for the relevant pods.
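All of the checks above run from a single invocation; `my-service` is a placeholder name:

```shell
# Run the full health check against a deployed service
$ kt check my-service
```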

kubetorch.cli.describe_service(name: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>)

Show basic info for calling the service depending on whether an ingress is configured.
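A sketch of usage, assuming the subcommand is exposed as kt describe (following the naming pattern of the other service commands); `my-service` is a placeholder:

```shell
# Show how to call the service, with or without an ingress
$ kt describe my-service
```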

kubetorch.cli.deploy(target: str = <typer.models.ArgumentInfo object>)

Deploy a Python file or module to Kubetorch. This will deploy all functions and modules decorated with @kt.compute in the file or module.
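For example, deploying a single file (the filename `train.py` is a placeholder; any file or module containing @kt.compute-decorated callables works):

```shell
# Deploy every @kt.compute-decorated function/module in the file
$ kt deploy train.py
```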

kubetorch.cli.port_forward(name: str = <typer.models.ArgumentInfo object>, local_port: int = <typer.models.ArgumentInfo object>, remote_port: int = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>, pod: str = <typer.models.OptionInfo object>)

Port forward a local port to the specified Kubetorch service.

Example:

$ kt port-forward my-service
$ kt port-forward my-service 32300
$ kt port-forward my-service -n custom-namespace
$ kt port-forward my-service -p my-pod

This allows you to access the service locally using curl http://localhost:<port>.

kubetorch.cli.run(ctx: ~typer.models.Context, name: str = <typer.models.OptionInfo object>, run_async: bool = <typer.models.OptionInfo object>, file: int = <typer.models.OptionInfo object>)

Build and deploy a Kubetorch app that runs the provided CLI command. For the app to be deployed, the file being run must be a Python file that constructs a kt.app at the top of the file.

Example:

$ kt run python train.py --epochs 5
kubetorch.cli.list_queues(queue: str = <typer.models.OptionInfo object>, list: bool = <typer.models.OptionInfo object>, namespace: str = <typer.models.OptionInfo object>)

List pods that are currently queued, sorted by priority and creation timestamp.

Example:

$ kt queues
$ kt queues -q default
$ kt queues -l
kubetorch.cli.volume_actions(action: ~kubetorch.cli_utils.VolumeAction = <typer.models.ArgumentInfo object>, name: str = <typer.models.ArgumentInfo object>, storage_class: str = <typer.models.OptionInfo object>, size: str = <typer.models.OptionInfo object>, access_mode: str = <typer.models.OptionInfo object>, mount_path: str = <typer.models.OptionInfo object>, namespace: str = <typer.models.OptionInfo object>, all_namespaces: bool = <typer.models.OptionInfo object>)

Manage volumes used in Kubetorch deployments.

Examples:

$ kt volumes
$ kt volumes -A
$ kt volumes create my-vol
$ kt volumes create my-vol -c gp3-csi -s 20Gi
$ kt volumes delete my-vol
$ kt volumes ssh my-vol
kubetorch.cli.debug(pod: str = <typer.models.ArgumentInfo object>, namespace: str = <typer.models.OptionInfo object>, port: int = <typer.models.OptionInfo object>)

Start an interactive debugging session on the pod, which will connect to the debug server inside the service. Before running this command, you must call a method on the service with pdb=True or add a kt.deep_breakpoint() call into your code to enable debugging.
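A sketch of starting a session, assuming you've already triggered a breakpoint as described above; `my-service-pod` is a placeholder pod name (e.g. from kt list or kubectl):

```shell
# Attach to the debug server running inside the service's pod
$ kt debug my-service-pod
```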

kubetorch.cli.secret_actions(action: ~kubetorch.cli_utils.SecretAction = <typer.models.ArgumentInfo object>, name: str = <typer.models.ArgumentInfo object>, prefix: str = <typer.models.OptionInfo object>, namespace: str = <typer.models.OptionInfo object>, all_namespaces: bool = <typer.models.OptionInfo object>, yes: bool = <typer.models.OptionInfo object>, path: str = <typer.models.OptionInfo object>, provider: str = <typer.models.OptionInfo object>, env_vars: ~typing.List[str] = <typer.models.OptionInfo object>, show_values: bool = <typer.models.OptionInfo object>)

Manage secrets used in Kubetorch deployments.

Examples:

$ kt secrets  # list secrets in the default namespace
$ kt secrets list -n my_namespace  # list secrets in the my_namespace namespace
$ kt secrets -A  # list secrets in all namespaces
$ kt secrets create --provider aws  # create a secret with the aws credentials in the default namespace
$ kt secrets create my_secret -v ENV_VAR_1 -v ENV_VAR_2 -n my_namespace  # create a secret using env vars
$ kt secrets delete my_secret -n my_namespace  # delete a secret called my_secret from the my_namespace namespace
$ kt secrets delete aws  # delete a secret called aws from the default namespace