Introduction to Vultr Cloud GPUs Powered by NVIDIA A16

Updated on July 17, 2024

This reference guide provides information on Vultr's Cloud GPU support for the NVIDIA A16 GPU. Vultr is the first cloud computing provider to offer the NVIDIA A16 GPU and uses virtualization to provide fractions of the GPU at an affordable price. The result is a product that delivers low-latency Windows and Linux virtual desktops with a high-performance NVIDIA GPU attached.

With Vultr Cloud GPUs, virtual desktop infrastructure (VDI) is a practical option for businesses: it offers near-native performance, a marked improvement over VDI's earlier reputation for poor latency and dropped frames. Vultr Cloud GPU instances featuring the NVIDIA A16 use NVIDIA Virtual PC (vPC) software to accelerate desktop applications, which makes them well suited to virtual desktops, transcoding, and similar workloads. These GPUs also handle machine learning and AI inference tasks, providing high performance and low latency for real-time AI applications. Whether you need a powerful virtual workstation, resource-intensive video encoding, or inference capabilities, Vultr Cloud GPUs powered by NVIDIA A16 cover the use case.

The A16 is compatible with the Vultr GPU stack, which comes with pre-installed tools such as JupyterLab and the NVIDIA GPU drivers that the GPU needs in order to function.

The NVIDIA A16 GPU is a smaller and more cost-effective version of the NVIDIA A40 GPU designed for virtual desktop infrastructure (VDI), virtual applications (VApps), and headless apps in a forwarded system. This GPU is optimized for compute, workstation, and VDI workloads. This guide provides detailed information on how to use these GPUs for your specific needs.

Important Information on GPU Allocation

Each A16 card contains four discrete GPUs. When you deploy a single-GPU A16 plan, Vultr allocates one of them, and it appears as one GPU on your cloud server. If you purchase a 4-GPU plan, you receive the equivalent of a full NVIDIA A16 card, which appears as four separate GPUs. Note that the four allocations are not guaranteed to be on the same card and can be spread across different cards in various combinations. For example, two allocations may be on one card and two on another, three on one card and one on another, or all four on the same card. If you require a full NVIDIA A16 card with all four allocations on one card, consider purchasing a Bare Metal plan.
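
To confirm how many GPUs your plan exposes, you can count the entries that `nvidia-smi -L` prints, one line per visible GPU. The helper below is a minimal sketch, not part of Vultr's tooling; the function name `count_gpus` is hypothetical, and it assumes the usual `GPU <index>: ...` line format of `nvidia-smi -L`.

```shell
# Count GPUs in `nvidia-smi -L`-style output read from stdin.
# Each visible GPU is expected on its own line beginning with "GPU <index>:".
count_gpus() {
    # grep -c exits non-zero when the count is 0, so keep the status clean.
    grep -c '^GPU [0-9][0-9]*:' || true
}

# Usage on a Cloud GPU server:
#   nvidia-smi -L | count_gpus
```

A single-GPU plan should report 1 and a 4-GPU plan should report 4, regardless of which physical cards the allocations come from.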

Inference

Vultr's A16 GPUs are designed to handle high-performance inference tasks efficiently. Leveraging NVIDIA’s powerful GPU architecture, these instances provide excellent support for deploying and running machine learning models with real-time performance and low latency. The A16’s architecture is well-suited for a range of AI applications, including real-time data processing and complex neural network inference.

  • The NVIDIA CUDA Toolkit is pre-installed, so AI and machine learning applications can use the GPU to accelerate computation and parallel processing tasks.
  • For optimized performance, ensure that the NVIDIA GPU drivers are correctly installed and configured on your instance. The CUDA Toolkit provides the libraries and tools needed to use the A16 for inference tasks.
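
A quick way to check that the driver and toolkit are usable is to confirm that their command-line tools are on your `PATH`. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, and it assumes the CUDA compiler `nvcc` is exposed on `PATH` (on some installations it lives under a CUDA-specific directory and must be added manually).

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage on a Cloud GPU server:
#   missing_tools nvidia-smi nvcc   # prints nothing if both are available
#   nvidia-smi                      # driver version and per-GPU status
#   nvcc --version                  # CUDA compiler version
```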

RTX Virtual Workstation (vWS)

NVIDIA RTX Virtual Workstation (vWS) is a software solution that enables GPU-accelerated virtual desktops and applications with high-performance, professional-grade graphics. It provides remote access to powerful graphics processing capabilities on any device, anywhere, over a network connection. With NVIDIA RTX vWS, users can run resource-intensive applications such as 3D design, video editing, and scientific simulations in virtualized environments such as cloud data centers, on-premises servers, or workstations. The solution leverages the NVIDIA RTX platform, including Turing architecture and RT Cores, to deliver real-time ray tracing and AI-accelerated workflows for high-end graphics and performance. NVIDIA RTX vWS also supports multiple operating systems, making it a versatile solution for businesses and organizations.

Vultr's A16 GPUs are fully licensed with RTX Enterprise drivers and NVIDIA RTX vWS.

vGPU Profiles

A vGPU profile is a virtual GPU resource assigned to a virtual machine. The profile determines the amount of GPU frame buffer allocated to the virtual machine and is chosen to improve the cost, scalability, stability, and performance of the VDI environment. The vGPU profiles are grouped into different series, each optimized for a specific class of workload, and have a fixed amount of frame buffer, supported display heads, and maximum resolutions. You can learn more about the A16's vGPU profiles in the NVIDIA RTX Virtual Workstation Sizing Guide.

Vultr's A16 GPUs are deployed with the Q-profile, which is optimized for creative and technical professionals who require the performance and features of NVIDIA RTX Enterprise drivers.

Virtual Desktop Infrastructure (VDI)

Vultr supports VDI with Microsoft Windows and many Linux distributions running the GNOME, KDE, or Xfce desktop environments.

  • When you deploy Microsoft Windows, the drivers are already installed in your Cloud GPU server.
  • If you use Linux, you should run nvidia-xconfig after configuring your desktop. The nvidia-xconfig tool is an automated tool for configuring NVIDIA graphics. The command should be executed on its own to configure all necessary settings. There is generally no need to modify the tool's configuration. Modifications may be required in specific advanced scenarios, such as using streaming applications in headless mode. However, these modifications are outside the scope of typical usage for VDI.
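
As a rough illustration of the Linux step, the sketch below runs the tool and then checks that the resulting X configuration selects the NVIDIA driver. It assumes `nvidia-xconfig` writes to its usual default location, `/etc/X11/xorg.conf`; the helper name `xorg_uses_nvidia` is hypothetical.

```shell
# Succeed if the given X configuration file selects the "nvidia" driver.
xorg_uses_nvidia() {
    grep -Eq '^[[:space:]]*Driver[[:space:]]+"nvidia"' "$1"
}

# Usage on a Linux Cloud GPU server (run as root):
#   nvidia-xconfig
#   xorg_uses_nvidia /etc/X11/xorg.conf && echo "X is configured for the NVIDIA driver"
```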

Getting Started

Before you begin, install the appropriate client and have the IP address, username, and password for the Cloud GPU at hand.

Your initial Cloud GPU access must be performed with one of two protocols, depending on your operating system of choice. After you have connected with VNC or RDP, you can install the VDI solution of your choice, such as Parsec, Teradici, XRPA, or others.

  • Microsoft Windows users must use Remote Desktop Protocol (RDP), a proprietary protocol developed by Microsoft that provides a graphical interface to another computer. To connect to a Windows Cloud GPU using RDP, you must have an RDP client installed, such as the built-in Remote Desktop Connection tool, along with the IP address or hostname of the Cloud GPU and the username and password. Once the RDP client is open, enter the IP address and the credentials to connect to the Cloud GPU.
  • Linux users must use Virtual Network Computing (VNC), an open-source remote desktop protocol that provides a graphical interface to another computer. To connect to a Linux Cloud GPU using VNC, you must have a VNC client installed, the IP address of the Cloud GPU, and the username and password. Once the VNC client is open, enter the IP address and the credentials, and you will be connected to the Cloud GPU.
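
If a client fails to connect, it can help to check first that the remote desktop port is reachable. The sketch below assumes the conventional default ports (3389 for RDP, 5900 for VNC display :0); your server may listen elsewhere. The function name `default_remote_port` and the address `203.0.113.10` are placeholders.

```shell
# Map the server's operating system to the conventional default port of its
# initial-access protocol: RDP on 3389, VNC display :0 on 5900.
default_remote_port() {
    case "$1" in
        windows) echo 3389 ;;
        linux)   echo 5900 ;;
        *)       echo "unknown OS: $1" >&2; return 1 ;;
    esac
}

# Usage (203.0.113.10 is a placeholder address):
#   nc -z -w 5 203.0.113.10 "$(default_remote_port linux)" && echo "VNC port is reachable"
```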

Note: You cannot use the Vultr Web Console with a Vultr Cloud GPU.

NVIDIA Drivers

By default, Vultr installs NVIDIA vGPU Enterprise drivers on your Cloud GPU server.

NVIDIA Enterprise drivers are specifically designed for enterprise environments and offer features and capabilities that are not included in the standard consumer drivers. These features and capabilities are geared toward providing improved stability, performance, and support for enterprise-level applications and systems.

NVIDIA Enterprise drivers for the A16 GPU offer several benefits, including:

  • Certified compatibility: Drivers are certified and tested to work with a wide range of enterprise applications, ensuring compatibility and reducing the risk of issues arising from using untested or unverified drivers.
  • Improved stability: Drivers are optimized to provide a more stable experience, with a focus on reducing crashes, system hangs, and other stability-related issues.
  • Enhanced performance: Drivers offer improved performance for a range of enterprise-level applications, including data center and scientific computing, as well as machine learning and AI workloads.
  • Advanced features: Drivers offer advanced features, such as GPUDirect for Video, which enables high-quality video processing, and enhanced support for virtualized environments, including NVIDIA GRID and NVIDIA vGPU.
  • Dedicated enterprise support: Drivers come with dedicated support from NVIDIA, ensuring that enterprise customers have access to the resources and expertise they need to resolve any issues or questions that may arise.

In summary, the NVIDIA Enterprise drivers that Vultr installs on your Cloud GPU provide a more stable, performant, and feature-rich experience for NVIDIA GPUs, including the A16.

Automatic Installation (Recommended)

  1. Download the installation script.

     $ curl -s -o /opt/nvidia/linux_gpu.sh https://apprepo.vultr.com/static/linux_gpu.sh
  2. Run the installation script.

     $ bash /opt/nvidia/linux_gpu.sh

Manual Installation

In some cases, you may wish to do your own GPU driver installation. Vultr compiles a custom Linux kernel to support NVIDIA Enterprise drivers, so you must follow the steps below when reinstalling drivers.

Prerequisites

  1. First, install the prerequisite packages:

    • For CentOS and RHEL-based distributions:

        $ yum install -y epel-release
        $ yum group install "Development Tools" -y
        $ yum install -y wget cmake pkg-config libglvnd-devel mesa-libGL-devel zlib kernel-headers kernel-devel
    • Ubuntu:

        $ apt-get update && apt-get install -y vim man wget unzip curl gnupg2 ca-certificates lsb-release apache2-utils \
        ethtool build-essential zlib1g cmake pkg-config libglvnd-dev libegl1 libopenblas-dev liblapack-dev \
        linux-headers-generic
    • Debian-based distributions other than Ubuntu:

        $ apt-get update && apt-get install -y vim man wget unzip curl gnupg2 ca-certificates lsb-release apache2-utils \
        ethtool build-essential zlib1g cmake pkg-config libglvnd-dev libegl1 libopenblas-dev liblapack-dev \
        linux-headers-amd64
  2. Disable the Nouveau drivers by running these commands: https://docs.vultr.com/public/doc-assets/legacy/7720/disable-nouveau.sh

  3. Install the licensing configuration files by running these commands: https://docs.vultr.com/public/doc-assets/legacy/7720/install-license.sh

  4. Edit the /etc/nvidia/gridd.conf file that you created in the previous step.

     $ nano /etc/nvidia/gridd.conf
  5. Replace every instance of {{MACHERE}} with the MAC address of the Cloud GPU's primary network interface.

  6. Remove the vultr-nvidia-client-drivers package if it is present.

     $ apt remove vultr-nvidia-client-drivers
  7. Download the new driver archive.

     $ wget -q -O /tmp/vultr_nvidia_driver.zip "http://169.254.169.254/latest/nvidia_linux_driver_installer_url"
  8. Extract the driver archive.

     $ unzip -qq -o /tmp/vultr_nvidia_driver.zip -d /
  9. Remove the driver archive and install the driver.

     $ rm -f /tmp/vultr_nvidia_driver.zip
     $ bash /opt/nvidia/install.sh
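
The placeholder substitution in steps 4 and 5 can also be scripted instead of done by hand in an editor. The following is a minimal sketch under stated assumptions: `fill_mac_placeholder` is a hypothetical helper, and `eth0` is an assumed interface name that you should replace with your server's primary interface as shown by `ip link`.

```shell
# Replace every {{MACHERE}} placeholder in a file with the given MAC address.
# The file is rewritten in place via a temporary copy so its permissions are kept.
fill_mac_placeholder() {
    conf_file=$1
    mac=$2
    sed "s/{{MACHERE}}/$mac/g" "$conf_file" > "$conf_file.new" &&
        cat "$conf_file.new" > "$conf_file" &&
        rm -f "$conf_file.new"
}

# Usage (eth0 is an assumed interface name; check `ip link` for the real one):
#   fill_mac_placeholder /etc/nvidia/gridd.conf "$(cat /sys/class/net/eth0/address)"
#   grep -c '{{MACHERE}}' /etc/nvidia/gridd.conf   # prints 0 once all placeholders are replaced
```

After the driver installation finishes, running `nvidia-smi` is a simple way to confirm that the driver loads and the GPU is visible.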

Frequently Asked Questions

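Does the A16 use Multi-Instance GPU (MIG) or temporal fractionalization?

Vultr uses temporal fractionalization for the A16 GPU: virtual machines share a physical GPU by taking turns on it in time slices, rather than each receiving a fixed hardware partition as with MIG.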

I deployed a Cloud GPU with four A16s, but they are on different physical cards.

If you purchase a 4-GPU plan, you receive the equivalent of a full NVIDIA A16 card, which appears as four separate GPUs. The four allocations are not guaranteed to be on the same card and can be spread across different cards in various combinations. For example, two allocations may be on one card and two on another, three on one card and one on another, or all four on the same card.

If you require a full NVIDIA A16 card with all four allocations on one card, you should consider purchasing a Bare Metal plan.

Codec Support

The NVIDIA A16 Tensor Core GPU supports the following video codecs for both encoding (NVENC) and decoding (NVDEC) operations:

  • For encoding:
    • H.264 (AVC)
    • H.265 (HEVC)
  • For decoding:
    • H.264 (AVC)
    • H.265 (HEVC)
    • VP9
    • MPEG-2
    • VC-1
    • VP8
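
As an illustration of hardware-accelerated transcoding with these codecs, the sketch below maps a codec name to the matching NVENC encoder in FFmpeg. FFmpeg is an assumption here, since this guide does not prescribe a transcoding tool, and it must be built with NVENC support. The helper name `nvenc_encoder_for` and the file names are hypothetical.

```shell
# Map a codec name to the FFmpeg NVENC encoder that handles it.
# Only the two encode codecs listed above are covered.
nvenc_encoder_for() {
    case "$1" in
        h264|avc)  echo h264_nvenc ;;
        h265|hevc) echo hevc_nvenc ;;
        *)         echo "no NVENC encoder listed for: $1" >&2; return 1 ;;
    esac
}

# Usage (input.mp4 and output.mp4 are placeholder file names):
#   ffmpeg -hwaccel cuda -i input.mp4 -c:v "$(nvenc_encoder_for h265)" -c:a copy output.mp4
```

Here `-hwaccel cuda` asks FFmpeg to decode on the GPU where possible, and the audio stream is copied unchanged.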