How to Install K3s with NVIDIA GPU Operator on Ubuntu 22.04
Introduction
K3s is a lightweight, fully compliant Kubernetes distribution designed for simplified deployment and operation in resource-constrained environments. With a small memory footprint and a single-binary architecture, K3s is optimized for edge computing, IoT devices, and environments where traditional Kubernetes setups may be too resource-intensive. It streamlines Kubernetes by removing non-essential features and dependencies, making it faster and easier to install while retaining all core Kubernetes functionality. K3s is particularly valued for its ease of setup and efficient management, making Kubernetes accessible in scenarios that demand a minimal, reliable, and production-ready cluster solution.
This article explains how to install and configure K3s on a Vultr Cloud GPU instance running Ubuntu 22.04 and how to set up NVIDIA GPU Operator support. You will install K3s, configure Kubernetes access for Helm, and add firewall rules that enable external access to the cluster. You will then install the NVIDIA GPU Operator to manage GPU resources within K3s.
Prerequisites
Before you begin:
- Deploy a Vultr Cloud GPU instance running Ubuntu 22.04.
- Access the server using SSH as a non-root user with sudo privileges.
Install K3s and NVIDIA GPU Operator
Disable the Docker system service.
console$ sudo systemctl disable docker
Stop the Docker system service.
console$ sudo systemctl stop docker
View the Docker service status and verify that it's inactive.
console$ sudo systemctl status docker
Output:
○ docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
     Active: inactive (dead) since Wed 2024-11-06 20:45:17 UTC; 5s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
    Process: 1088 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=0/SUCCESS)
   Main PID: 1088 (code=exited, status=0/SUCCESS)
        CPU: 404ms
Install K3s.
console$ curl -sfL https://get.k3s.io | sh -
Output:
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO] systemd: Starting k3s
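Once the installer finishes, it can help to confirm that the node has registered before continuing. The following is a minimal sketch rather than part of the K3s installer: `k3s_nodes_ready` is a hypothetical helper name, and the sketch assumes the bundled client is run as `sudo k3s kubectl` because the generated kubeconfig is readable only by root at this point.

```shell
# Hypothetical helper (not shipped with K3s): returns 0 only when every
# node listed by `kubectl get nodes` has the exact status "Ready".
# Assumption: sudo is needed to read /etc/rancher/k3s/k3s.yaml.
k3s_nodes_ready() {
  statuses="$(sudo k3s kubectl get nodes --no-headers 2>/dev/null | awk '{print $2}')"
  # No output means the API server is not answering yet.
  [ -n "$statuses" ] || return 1
  # Any line that is not exactly "Ready" makes the check fail.
  ! printf '%s\n' "$statuses" | grep -qvx 'Ready'
}
```

For example, `until k3s_nodes_ready; do sleep 5; done` polls until the node reports Ready.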
Create a new .kube directory in your user home directory. Replace linuxuser with your actual user.
console$ mkdir -p /home/linuxuser/.kube
Add a k3s.yaml symbolic link to the config file in the .kube directory to set it as the default Kubernetes configuration file.
console$ ln -s /etc/rancher/k3s/k3s.yaml /home/linuxuser/.kube/config
- ln -s /etc/rancher/k3s/k3s.yaml /home/linuxuser/.kube/config: Creates a symbolic link to the K3s Kubernetes configuration file (k3s.yaml) in the default location (/home/linuxuser/.kube/config), allowing Kubernetes CLI tools (like kubectl) to access K3s.
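Change the .kube/config file permissions to 755 to enable Helm to load the configuration file. Because config is a symbolic link, chmod applies the new mode to the target file, /etc/rancher/k3s/k3s.yaml.
console$ sudo chmod 755 /home/linuxuser/.kube/config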
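Because `config` is a symbolic link, it can be useful to see which file the path resolves to and what mode that file now has. The following is a minimal sketch, assuming GNU `readlink` and `stat` as shipped with Ubuntu; `kubeconfig_info` is a hypothetical helper name.

```shell
# Hypothetical helper: prints where a kubeconfig path resolves and the
# octal mode of the resolved file. Defaults to ~/.kube/config.
# Assumption: GNU coreutils syntax (readlink -f, stat -c).
kubeconfig_info() {
  cfg="${1:-$HOME/.kube/config}"
  target="$(readlink -f "$cfg")" || return 1
  # readlink -f can succeed for a dangling link, so check the target exists.
  [ -e "$target" ] || { echo "missing: $target" >&2; return 1; }
  printf '%s -> %s (mode %s)\n' "$cfg" "$target" "$(stat -c '%a' "$target")"
}
```

Running `kubeconfig_info` with no arguments inspects the default path for the current user.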
Install Helm to manage and install Kubernetes applications.
console$ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Add the NVIDIA Helm repository.
console$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
- helm repo add nvidia ...: Adds the NVIDIA Helm chart repository. This repository contains the GPU Operator and other NVIDIA tools for Kubernetes.
- helm repo update: Updates Helm's local repository cache, ensuring the latest package information is available.
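Before installing, you may want to confirm that the chart is visible in the local repository cache. The following is a minimal sketch built on `helm search repo`; `chart_available` is a hypothetical helper name, and the default argument assumes the `nvidia` repository alias used in the command above.

```shell
# Hypothetical helper: succeeds when `helm search repo` lists the given chart.
# Assumption: the repository was added under the alias "nvidia".
chart_available() {
  chart="${1:-nvidia/gpu-operator}"
  # Skip the header row and compare the NAME column exactly.
  helm search repo "$chart" 2>/dev/null |
    awk -v c="$chart" 'NR > 1 && $1 == c { found = 1 } END { exit !found }'
}
```

For example, `chart_available || echo "chart not found; re-run helm repo update"` reports a missing chart without stopping the shell.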
Install the NVIDIA GPU Operator.
console$ helm install --wait gpu-operator nvidia/gpu-operator --create-namespace -n gpu-operator --set driver.enabled=false
- helm install --wait gpu-operator nvidia/gpu-operator: Installs the NVIDIA GPU Operator, a Helm chart that helps manage NVIDIA GPU resources in Kubernetes.
- --create-namespace -n gpu-operator: Creates a new namespace called gpu-operator for the GPU Operator resources.
- --set driver.enabled=false: Installs the GPU Operator without the NVIDIA driver. This is done because the driver is already installed on the system.
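One way to check that the operator exposes the GPU to workloads is to schedule a pod that requests the `nvidia.com/gpu` resource. The following is a minimal sketch, not a manifest from the chart: `gpu_test_pod_manifest` is a hypothetical helper, and the pod name, default image, and `nvidia-smi` command are placeholders chosen for illustration.

```shell
# Hypothetical helper: prints a Pod manifest that requests one GPU through
# the nvidia.com/gpu extended resource.
# Assumptions: pod name, default image, and command are placeholders;
# pass a different image as the first argument if needed.
gpu_test_pod_manifest() {
  image="${1:-ubuntu:22.04}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: gpu-check
      image: ${image}
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}
```

You could apply it with `gpu_test_pod_manifest | kubectl apply -f -` and read the result with `kubectl logs gpu-smoke-test`. Whether `nvidia-smi` is available inside the container depends on how the container runtime is configured. On some K3s setups the pod may also need `runtimeClassName: nvidia`, which this sketch does not set.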
Allow the Kubernetes API server port (6443) and the default NodePort range (30000 to 32767) through the firewall.
console$ sudo ufw allow 6443/tcp && sudo ufw allow 30000:32767/tcp && sudo ufw allow 30000:32767/udp
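To confirm the rules took effect, you can look for them in the `ufw status` listing. The following is a minimal sketch that parses that listing; `ufw_allows` is a hypothetical helper name, and it assumes the default (non-verbose) column layout in which the first column is the port spec and the second is the action.

```shell
# Hypothetical helper: succeeds when `ufw status` shows an ALLOW rule whose
# first column matches the given port spec, for example 6443/tcp.
# Assumption: default `ufw status` layout (To / Action / From columns).
ufw_allows() {
  sudo ufw status 2>/dev/null |
    awk -v p="$1" '$1 == p && $2 == "ALLOW" { ok = 1 } END { exit !ok }'
}
```

For example, `for r in 6443/tcp 30000:32767/tcp 30000:32767/udp; do ufw_allows "$r" || echo "missing rule: $r"; done` lists any rule that is not present.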
Enable the K3s system service.
console$ sudo systemctl enable k3s
View the K3s service status and verify that it's running.
console$ sudo systemctl status k3s
Output:
● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-11-06 20:45:47 UTC; 7min ago
       Docs: https://k3s.io
   Main PID: 2883 (k3s-server)
      Tasks: 206
     Memory: 3.9G
        CPU: 1min 19.382s
     CGroup: /system.slice/k3s.service
             ...................................
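The full status output is verbose, so a one-line summary per service can be easier to scan or script against. The following is a minimal sketch using `systemctl is-enabled` and `systemctl is-active`; `service_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: prints "<unit> enabled=<state> active=<state>" per unit.
# systemctl exits non-zero for disabled or inactive units, so failures are
# tolerated and the printed state is what matters.
service_summary() {
  for unit in "$@"; do
    enabled="$(systemctl is-enabled "$unit" 2>/dev/null || true)"
    active="$(systemctl is-active "$unit" 2>/dev/null || true)"
    printf '%s enabled=%s active=%s\n' "$unit" "${enabled:-unknown}" "${active:-unknown}"
  done
}
```

After the steps in this article, `service_summary docker k3s` should report Docker as disabled and inactive, and K3s as enabled and active.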
Conclusion
In this article, you installed and configured K3s on a Vultr Cloud GPU instance running Ubuntu 22.04 and set up NVIDIA GPU Operator support. You installed K3s and configured Helm to manage Kubernetes applications. With the firewall rules configured, your cluster is now accessible, and the NVIDIA GPU Operator is set up to manage GPU resources within K3s. Your lightweight Kubernetes cluster is now ready to run GPU workloads.