📦🔥 Kubetorch
Kubetorch Examples
Hello, World
Training: PyTorch DDP
Inference: vLLM
Training
MNIST Torchvision
Automated Re-Training (Airflow)
Supervised Fine Tuning (Llama3)
Ray (Tune - HPO)
Ray (Train, Data - DLRM)
Lightning (ImageNet)
TensorFlow
XGBoost on GPU
PyTorch DDP (ResNet)
Fault Tolerance
Training Pod Preemption Recovery
Find Batch Size
Fail to Larger Compute
Reinforcement Learning
Basic GRPO with Kubetorch
Async GRPO
TRL with a Code Sandbox
VERL Training
Launch Code Sandboxes
Inference
DeepSeek - vLLM
OpenAI OSS - Transformers
Triton Inference Server
Batch Embeddings
RAG App (Composite AI System)
404: Not Found