How to Install K3s with NVIDIA GPU Operator on Ubuntu 22.04

Updated on November 29, 2024

Introduction

K3s is a lightweight, fully compliant Kubernetes distribution designed for simplified deployment and operation in resource-constrained environments. With a small memory footprint and a single-binary architecture, K3s is optimized for edge computing, IoT devices, and environments where traditional Kubernetes setups are too resource-intensive. It streamlines Kubernetes by removing non-essential features and dependencies, so it installs faster and more easily while retaining all core Kubernetes functionality. K3s is particularly valued for its straightforward setup and efficient management, which make Kubernetes practical in scenarios that demand a minimal, reliable, and production-ready cluster solution.

This article explains how to install and configure K3s on a Vultr Cloud GPU instance running Ubuntu 22.04, and set up NVIDIA GPU Operator support on your system. You will install K3s, configure Kubernetes with Helm, and set up firewall rules to enable external access to the cluster. Additionally, you will install the NVIDIA GPU Operator to manage GPU resources within K3s for optimized GPU workloads.

Prerequisites

Before you begin:

  • Deploy a Vultr Cloud GPU instance running Ubuntu 22.04 to use as the K3s server.
  • Access the instance using SSH as a non-root user with sudo privileges.
  • Verify that the NVIDIA GPU driver is installed on the instance, because this guide deploys the GPU Operator with the driver component disabled.

Install K3s and NVIDIA GPU Operator

  1. Disable the Docker system service so that it no longer starts at boot. K3s ships with its own embedded containerd runtime and does not require Docker.

    console
    $ sudo systemctl disable docker
    
  2. Stop the Docker system service.

    console
    $ sudo systemctl stop docker
    
  3. View the Docker service status and verify that it's inactive.

    console
    $ sudo systemctl status docker
    

    Output:

    ○ docker.service - Docker Application Container Engine
         Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
         Active: inactive (dead) since Wed 2024-11-06 20:45:17 UTC; 5s ago
    TriggeredBy: ● docker.socket
           Docs: https://docs.docker.com
        Process: 1088 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=0/SUCCESS)
       Main PID: 1088 (code=exited, status=0/SUCCESS)
            CPU: 404ms
  4. Install K3s.

    console
    $ curl -sfL https://get.k3s.io | sh -
    

    Output:

    [INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
    [INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
    [INFO]  systemd: Enabling k3s unit
    Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
    [INFO]  systemd: Starting k3s
  5. Create a new .kube directory in your user's home directory. Replace linuxuser with your actual username.

    console
    $ mkdir -p /home/linuxuser/.kube
    
  6. Create a symbolic link named config in the .kube directory that points to the K3s k3s.yaml file, setting it as the default Kubernetes configuration file.

    console
    $ ln -s /etc/rancher/k3s/k3s.yaml /home/linuxuser/.kube/config
    
    • ln -s /etc/rancher/k3s/k3s.yaml /home/linuxuser/.kube/config: Creates a symbolic link at /home/linuxuser/.kube/config that points to the K3s Kubernetes configuration file (k3s.yaml), allowing Kubernetes CLI tools (like kubectl) to access K3s. A quick verification using kubectl is shown after this list.
  7. Change the .kube/config file permissions to 755 so that your user, and tools such as Helm and kubectl, can read the configuration file.

    console
    $ sudo chmod 755 /home/linuxuser/.kube/config
    
  8. Install Helm to manage and install Kubernetes applications.

    console
    $ curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
    
  9. Add the NVIDIA Helm repository.

    console
    $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
    
    • helm repo add nvidia ...: Adds the NVIDIA Helm chart repository. This repository contains the GPU Operator and other NVIDIA tools for Kubernetes.
    • helm repo update: Updates Helm’s local repository cache, ensuring the latest package information is available.
  10. Install the NVIDIA GPU Operator.

    console
    $ helm install --wait gpu-operator nvidia/gpu-operator --create-namespace -n gpu-operator --set driver.enabled=false
    
    • helm install --wait gpu-operator nvidia/gpu-operator: Installs the NVIDIA GPU Operator, a Helm chart that helps manage NVIDIA GPU resources in Kubernetes.
    • --create-namespace -n gpu-operator: Creates a new namespace called gpu-operator for the GPU Operator resources.
    • --set driver.enabled=false: Installs the GPU Operator without the NVIDIA driver because the driver is already installed on the system. A sketch for verifying the GPU Operator is shown after this list.
  11. Add firewall rules to allow the Kubernetes API server port 6443/TCP and the NodePort service range 30000 to 32767 over TCP and UDP.

    console
    $ sudo ufw allow 6443/tcp && sudo ufw allow 30000:32767/tcp && sudo ufw allow 30000:32767/udp
    
  12. Enable the K3s system service to start automatically at boot.

    console
    $ sudo systemctl enable k3s
    
  13. View the K3s service status and verify that it's running.

    console
    $ sudo systemctl status k3s
    

    Output:

    ● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-11-06 20:45:47 UTC; 7min ago
       Docs: https://k3s.io
    Main PID: 2883 (k3s-server)
      Tasks: 206
     Memory: 3.9G
        CPU: 1min 19.382s
     CGroup: /system.slice/k3s.service
     ...................................
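
With the kubeconfig linked and the K3s service running, verify that kubectl can reach the cluster. This is a minimal verification sketch; the node name and Kubernetes version in your output will differ.

console
$ kubectl get nodes

The single K3s node should report a Ready status. If the command returns a permission error, re-check the /home/linuxuser/.kube/config symbolic link and file permissions you set earlier.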

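To confirm that the GPU Operator manages the GPU, list its pods and run a short test workload. The commands below are a sketch only; the CUDA image tag nvidia/cuda:12.4.1-base-ubuntu22.04 and the pod name gpu-test are example values that you may need to adjust for your environment.

console
$ kubectl get pods -n gpu-operator

All pods in the gpu-operator namespace should reach a Running or Completed state. Then create a test pod that requests one GPU and runs nvidia-smi:

console
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

After the pod completes, view the nvidia-smi output with kubectl logs gpu-test, then clean up with kubectl delete pod gpu-test.
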
Conclusion

In this article, you installed and configured K3s on a Vultr Cloud GPU instance running Ubuntu 22.04, installed Helm to manage Kubernetes applications, and set up NVIDIA GPU Operator support. With the firewall rules configured, your cluster is now accessible externally, and the NVIDIA GPU Operator manages GPU resources within K3s, enabling optimized GPU workloads. Your lightweight Kubernetes cluster is now ready for deployment, providing a scalable and efficient solution for resource-constrained environments.