How to Deploy Domino Nexus Data Plane on Vultr Kubernetes Engine (VKE)

Updated on October 14, 2024

Introduction

Vultr and Domino Data Labs have established a strategic alliance, combining Domino's state-of-the-art MLOps platform with Vultr's high-performance cloud services.

Domino Nexus is a comprehensive platform that allows you to execute Data Science and Machine Learning workloads across any compute cluster - whether in the cloud, a specific region, or on-premises. By unifying data science silos across the enterprise, Domino Nexus provides a centralized hub for building, deploying, and monitoring models.

This article demonstrates the steps to deploy a Nexus Data Plane on the Vultr Kubernetes Engine (VKE). It also covers establishing a connection to a Nexus Control Plane managed by Domino and running a sample workload on a node pool of virtual A100 GPU machines.

Prerequisites

Before you begin, you should:

  • Deploy a Kubernetes cluster at Vultr with at least 2 node pools. Select Kubernetes v1.24 for Domino compatibility.

    • platform: Vultr Optimized Compute nodes running Domino Nexus Data services.
    • default: Vultr Cloud GPU nodes for executing workloads. You can opt for either GPU or Non-GPU servers.
  • Install kubectl and helm on your local machine or Kubernetes workspace.

Add Node Labels

To make node groups available to Domino, Kubernetes worker nodes require a distinct dominodatalab.com/node-pool label.

Fetch the available nodes.

$ kubectl get nodes

Output.

NAME                    STATUS   ROLES    AGE     VERSION
default-488904950846    Ready    <none>   3d21h   v1.24.11
default-7e4fcc3c7604    Ready    <none>   3d21h   v1.24.11
default-dc036fca3fc5    Ready    <none>   3d21h   v1.24.11
platform-3ba268b6eea7   Ready    <none>   3d21h   v1.24.11
platform-51347cd0966f   Ready    <none>   3d21h   v1.24.11
platform-9ff48b3b6af1   Ready    <none>   3d21h   v1.24.11

Add the dominodatalab.com/node-pool label to the platform nodes.

$ kubectl label nodes platform-3ba268b6eea7 dominodatalab.com/node-pool=platform
$ kubectl label nodes platform-51347cd0966f dominodatalab.com/node-pool=platform
$ kubectl label nodes platform-9ff48b3b6af1 dominodatalab.com/node-pool=platform

Output.

node/platform-3ba268b6eea7 labeled
node/platform-51347cd0966f labeled
node/platform-9ff48b3b6af1 labeled

Add the dominodatalab.com/node-pool label to the default nodes.

$ kubectl label nodes default-488904950846 dominodatalab.com/node-pool=default-gpu
$ kubectl label nodes default-7e4fcc3c7604 dominodatalab.com/node-pool=default-gpu
$ kubectl label nodes default-dc036fca3fc5 dominodatalab.com/node-pool=default-gpu

You can use any other value for the dominodatalab.com/node-pool label. However, the value must be unique for each node group.

Output.

node/default-488904950846 labeled
node/default-7e4fcc3c7604 labeled
node/default-dc036fca3fc5 labeled
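On larger clusters, labeling each node by hand becomes tedious. The same labels can be applied in a loop; the sketch below assumes node names are prefixed with their pool name (platform-, default-), as in the kubectl get nodes output above.

```shell
# Map a node name to its Domino node-pool value, based on the
# name prefixes shown in the `kubectl get nodes` output above.
pool_for_node() {
    case "$1" in
        platform-*) echo "platform" ;;
        default-*)  echo "default-gpu" ;;
        *)          echo "" ;;
    esac
}

# Label every node in the cluster according to its prefix.
for node in $(kubectl get nodes -o name); do
    name="${node#node/}"
    pool="$(pool_for_node "$name")"
    if [ -n "$pool" ]; then
        kubectl label node "$name" "dominodatalab.com/node-pool=$pool" --overwrite
    fi
done
```

Nodes whose names match neither prefix are skipped, so the loop is safe to re-run after adding node pools.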

Register a Data Plane via the Domino Admin UI

Access the admin section by clicking the wrench icon in the bottom-left corner.

Domino Admin Section

Click the "Register Data Plane" button.

Register Data Plane Button

Provide a name and namespace for your data plane. If the namespace does not exist on the cluster, it will be created automatically.

In the Storage Class field, enter vultr-block-storage-retain. Vultr creates and manages this Storage Class as part of VKE. Refer to the Domino documentation for more information on registering a data plane.

Register Data Plane Form

Click the "Copy Command" button to copy the Helm command to your clipboard and then execute it on your cluster.

Copy Command Button

The process may take 5-10 minutes. Upon successful completion, you should see output similar to the following in your terminal.

NAME: data-plane
LAST DEPLOYED: Fri May 12 17:22:17 2023
NAMESPACE: dataplane-gpu
STATUS: deployed
REVISION: 1
TEST SUITE: None

The Domino GUI should also indicate that the Data Plane connection is healthy.

Healthy Data Plane

Create a Hardware Tier in Domino for the Data Plane

In the Domino Admin portal, hover over the "Advanced" tab and select "Hardware Tiers". Then, create a new Hardware Tier.

New Hardware Tier Button

Fill in the fields for Cluster, ID, Name, and Data Plane.

Describe a worker node to view its available resources.

$ kubectl describe node [insert-node-name]

Cores, memory, and GPU should reflect the available resources on an individual worker node.
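The relevant values appear in the Allocatable section of the describe output. As a convenience, the small helper below filters that output down to just this section; the function name and awk filter are our own sketch, not part of kubectl.

```shell
# Print only the "Allocatable:" section of `kubectl describe node` output.
# Usage: kubectl describe node <node-name> | allocatable_section
allocatable_section() {
    awk '/^Allocatable:/ { show=1; next }
         show && /^[A-Za-z]/ { show=0 }
         show'
}
```

The indented lines it prints (cpu, memory, and any nvidia.com/gpu entries) are the values to enter for Cores, Memory, and Number of GPUs in the Hardware Tier form.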

New Hardware Tier Form

Ensure that the Node Pool name matches the label applied to the nodes running your workloads. Leave the "Number of GPUs" field at 0 if running a Non-GPU workload.

Test the Data Plane

Click the Domino logo in the top corner to exit the Admin section.

Projects Section

Navigate to Projects > quick-start > Jobs > Run.

Run Job Button

Run a job using the file name main.py, the standard Compute Environment (Docker image), and the Hardware Tier you created from the dropdown. No other options are needed; start the job from this page.

Start Job Form

Note: The first time you run a job in the Data Plane, the Compute Environment Docker image must be pulled from the Control Plane. It will be cached for subsequent runs. If you feel the process is taking too long, you can describe the run pod in the VKE cluster to check its progress.
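A minimal sketch for checking that progress follows; the helper names are ours, and dataplane-gpu matches the namespace used in the registration example above.

```shell
# Namespace chosen when registering the data plane earlier in this article.
NAMESPACE="dataplane-gpu"

# List the pods in the data plane namespace to find the run pod.
list_run_pods() {
    kubectl get pods --namespace "$NAMESPACE"
}

# Show a pod's details, including image-pull events.
# Usage: describe_run_pod <pod-name>
describe_run_pod() {
    kubectl describe pod "$1" --namespace "$NAMESPACE"
}
```

The Events section at the end of the describe output shows whether the Compute Environment image is still being pulled.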

Successful Job Completion

A green dot indicates a successful test.

Conclusion

This article demonstrated the steps to deploy a Domino Nexus Data Plane on the Vultr Kubernetes Engine (VKE). It also covered establishing a connection to a Nexus Control Plane managed by Domino and running a sample workload on a node pool of virtual A100 GPU machines.

More Information