Introduction to Vultr Cloud GPU

Updated on September 29, 2023

A Graphics Processing Unit (GPU) is specialized hardware initially designed for computer graphics and image processing. The highly parallel structure of GPUs makes them more efficient than general-purpose Central Processing Units (CPUs) for algorithms that process large blocks of data in parallel. Traditionally, you must install an on-premises server with one or more GPUs to access this power. Purchasing hardware is expensive and inflexible, but there is an alternative.

Getting Started

If you'd like to jump right in, see the Vultr Cloud GPU Quickstart, which explains how to deploy a Cloud GPU or GPU Compute Marketplace App. NVIDIA also has a large catalog of software containers, pre-trained AI models, and Jupyter Notebooks ready to deploy on Vultr Cloud GPUs to get you started.

About Vultr Cloud GPU

Working in close collaboration with NVIDIA, we developed the Vultr Cloud GPU platform powered by NVIDIA GPUs and NVIDIA AI Enterprise software. Instead of attaching an entire physical GPU to a cloud server, we attach a fraction in the form of a virtual GPU (vGPU) to create a new instance type: the Cloud GPU.

When you deploy a Vultr Cloud GPU instance, there's no hassle with driver installation or license issues. You can skip all those steps and run an NVIDIA GPU-powered application in minutes. You can choose a GPU fraction for your workload and budget and then scale that up or down as needed. Cloud GPUs are ideal for a variety of cloud applications like Big Data applications, Virtual Desktop Infrastructure (VDI), Machine Learning (ML), Artificial Intelligence (AI), High-Performance Computing (HPC), video encoding, cloud gaming solutions, general-purpose computing with CUDA, graphics rendering, and more.

Cloud GPUs are Easy

Vultr pre-installs everything you need to get started. Our Cloud GPUs come with licensed NVIDIA drivers and the CUDA Toolkit. If you want a custom operating system that isn't in our library, install cloud-init, and we'll automatically install all those components for you. Then, follow the steps in our Cloud GPU Quickstart, and in a couple of minutes, you'll have a Cloud GPU instance ready to use, with low per-hour billing and no long-term commitments.

Cloud GPUs are Affordable

Dedicated GPUs are expensive and often underutilized, but you can use Cloud GPUs to match your workloads to the processing power you need, saving you time and expense. In addition, Cloud GPUs come in affordable fractions ranging from 1/20th of a card to a fully-dedicated NVIDIA A100 GPU. Our Bare Metal servers also support multiple cards for your most demanding applications.

Cloud GPUs are ideal for training models on smaller subsets of your data and then ramping up to full performance later. For example, you might use a scenario like this:

  1. Deploy a small Cloud GPU instance, then attach a Block Storage volume with your dataset.
  2. Train a neural net on a portion of that data and save the model on Block Storage.
  3. Detach the Block Storage and destroy the small instance.
  4. Deploy a larger Cloud GPU and attach the Block Storage. Then, load the saved model and process the full dataset.
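The save-and-resume steps above can be sketched in a few lines. This is a minimal, framework-agnostic illustration using only the Python standard library; the training function, the toy model, and the checkpoint location are hypothetical stand-ins for your own pipeline (on a real deployment the checkpoint would live on the Block Storage mount, for example under /mnt/blockstorage).

```python
import pickle
import tempfile
from pathlib import Path

def train(model, data):
    """Toy 'training' step: fold the data into the model's single weight."""
    model["weight"] = sum(data) / len(data)
    model["seen"] += len(data)
    return model

def save_model(model, path):
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(model))

def load_model(path):
    return pickle.loads(path.read_bytes())

# Stand-in for the Block Storage mount point on a real instance.
checkpoint = Path(tempfile.mkdtemp()) / "model.pkl"

# Step 2: on the small instance, train on a subset and save the model.
model = train({"weight": 0.0, "seen": 0}, data=[1.0, 2.0, 3.0])
save_model(model, checkpoint)

# Step 4: on the larger instance, reload and continue with the full dataset.
model = load_model(checkpoint)
model = train(model, data=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(model["seen"])  # 9
```

With a real framework, the same shape applies: replace the pickle calls with the framework's own checkpoint functions and point them at the Block Storage mount.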

This flexibility is an advantage over dedicated GPUs: there is no longer a need to purchase, install, and maintain your own on-premises hardware.


About GPU Virtualization

Traditionally, GPU applications require an on-premise server or a cloud server with a fully-dedicated GPU running in passthrough mode. Unfortunately, those solutions cost thousands of dollars per month. Vultr offers an alternative: Cloud GPU instances partitioned into virtual GPUs (vGPUs), which allows you to pick the performance level that matches your workload and budget. vGPUs are powered by NVIDIA AI Enterprise, which presents your server instance with a vGPU that looks just like a physical GPU. Each vGPU has its own dedicated memory slice and a corresponding portion of the physical GPU compute power. vGPUs run all the same frameworks, libraries, and operating systems as a physical GPU.

Vultr offers two types of vGPU partitioning: temporal partitioning and Multi-Instance GPU (MIG) spatial partitioning.

  • vGPU temporal partitioning is performed in software to deliver cost-effective GPU resources to applications. It's available on Cloud GPU plans with less than 10 GB of GPU RAM.
  • Cloud GPU plans with 10 GB of GPU RAM or more use MIG spatial partitioning to fully isolate the high-bandwidth memory cache and vGPU cores. This enables multiple GPU instances to run in parallel on a single physical NVIDIA A100 GPU.
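Once an instance is running, you can confirm which GPU or vGPU profile it received with the pre-installed nvidia-smi tool. The helper below is our own illustration, not part of any Vultr tooling: it shells out to `nvidia-smi -L` and degrades gracefully on machines without a GPU.

```python
import shutil
import subprocess

def list_gpus():
    """Return one line per GPU reported by nvidia-smi, or [] if the tool is absent."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return [line for line in result.stdout.splitlines() if line.strip()]

print(list_gpus())
```

On a Cloud GPU instance, each returned line names the vGPU or MIG device visible to your server.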

Types of GPU Servers

Vultr offers both Cloud GPU and Bare Metal options.

  • Cloud GPU servers are optimized with dedicated vCPU and vGPU resources. These are ideal for high-performance computing applications, such as video encoding, cloud gaming, and graphics rendering.
  • Bare Metal servers with GPU are single-tenant dedicated hardware for applications with the most demanding performance and security requirements.

Our Cloud GPUs come in a range of memory and compute configurations.

Vultr Cloud GPU Models

We offer a choice of GPU models with different characteristics.

NVIDIA A40

NVIDIA A40 accelerates your most demanding visual computing workloads with the latest Ampere architecture RT Cores, Tensor Cores, and CUDA Cores. These are ideal for VDI, video encoding, cloud gaming solutions, CUDA workloads, ML, and AI. Our A40 Cloud GPU servers have next-generation NVIDIA RTX technology for your most advanced professional visualization workloads.

Our A40-equipped servers have:

  • Second-generation RT Cores
  • Third-generation Tensor Cores
  • Ampere architecture CUDA Cores

For VDI, rendering, visualization, and cloud gaming applications, we recommend our A40-equipped servers.

NVIDIA A100

NVIDIA A100 is the flagship platform for deep learning, HPC, and data analytics. It accelerates over 700 HPC applications and every major deep learning framework, including TensorFlow, PyTorch, MXNet, and Theano, as well as Apache Spark. Our A100-equipped servers have third-generation Tensor Cores and Ampere architecture CUDA Cores. Because the A100 is designed to power AI and HPC compute workloads, it does not include RT Cores for ray tracing acceleration and isn't designed for VDI applications.

Use Cases

Vultr Cloud GPU servers are ideal for anyone who needs to deploy and scale GPU-powered applications quickly. Here is a quick overview of some popular reasons to use a Vultr Cloud GPU server.

Machine Learning and Artificial Intelligence

Machine Learning (ML) is a powerful method for building prediction algorithms from large datasets. ML prediction algorithms drive social media, online shops, search engines, streaming services, and information security applications.

Artificial Intelligence (AI) enables intelligent systems to mimic cognitive functions like decision-making and speech recognition. AI uses large training datasets to learn how to achieve a specific goal.

A sub-category of AI and ML is Computer Vision, which trains convolutional neural networks (CNNs) to recognize objects in images. Computer vision is useful for process automation, health care, self-driving cars, image classification, and more.

Cloud GPUs accelerate ML, AI, and Computer Vision tasks with frameworks like:

  • TensorFlow: An open-source machine learning framework and deep learning library developed by Google.
  • PyTorch: An open-source machine learning library for Python developed by Facebook's AI Research lab.
  • Theano: A high-performance Python library for tensor computation, designed to be easy to write and understand.
  • Apache MXNet: An open-source deep learning framework that features an easy-to-use development interface.
  • MATLAB: With just a few lines of MATLAB code, you can incorporate deep learning into your applications.
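Most of these frameworks detect the GPU automatically. As a quick sanity check, the sketch below uses PyTorch's standard `torch.cuda.is_available()` call; the wrapper is our own and simply returns False when PyTorch isn't installed, so it's safe to run anywhere.

```python
def gpu_backend_available():
    """Report whether a CUDA-capable GPU is visible to PyTorch.

    Returns False when PyTorch itself is not installed, so the check
    can run on any machine.
    """
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()

print(gpu_backend_available())
```

On a correctly configured Cloud GPU instance with PyTorch installed, this prints True.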

Big Data

Big Data extracts meaningful insights from large, complex datasets. Big data is characterized by the "three Vs": Volume, Velocity, and Variety. Volume is usually measured in terabytes, petabytes, or even exabytes. The data is created, read, moved, and analyzed at high velocity, such as on social media platforms. And it consists of a variety of data formats such as photos, video, audio, and complex documents.

GPUs help process big data with systems like:

  • Hadoop: A parallel processing application for large data sets distributed across multiple nodes.
  • Apache Spark: A fast multi-language general processing engine compatible with Hadoop data.
  • Apache Storm: A real-time computation system designed for unbounded data streams.

Video Encoding

GPUs can accelerate resource-intensive video encoding, which converts an original video source format to another format suitable for other devices, bitrates, or tools. For example, you can use GPUs with a tool like FFmpeg to convert raw video to H.264 format, which is the most widely used video format for the web.
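As a concrete illustration, the snippet below assembles an FFmpeg command line that decodes on the GPU and encodes with NVIDIA's hardware H.264 encoder (h264_nvenc). The file names are placeholders, and the command is only printed here; on a Cloud GPU with FFmpeg installed you would execute it.

```python
def nvenc_command(src, dst):
    """Build an FFmpeg command line that decodes on the GPU and encodes with NVENC."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",    # offload decoding to the GPU
        "-i", src,             # source file (placeholder name)
        "-c:v", "h264_nvenc",  # NVIDIA hardware H.264 encoder
        dst,                   # destination file (placeholder name)
    ]

cmd = nvenc_command("raw_input.mov", "output.mp4")
print(" ".join(cmd))
# On a Cloud GPU with FFmpeg installed, run it with subprocess.run(cmd, check=True).
```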

General Purpose Computing with CUDA

CUDA (Compute Unified Device Architecture) is an API designed to use a GPU for general-purpose computing with languages like C, C++, and Fortran. To learn more about using CUDA with a Vultr Cloud GPU, see the CUDA resources at the NVIDIA Developer Zone.

Graphics Processing and VDI

Graphics Processing is a traditional use case for GPUs. Video games, virtual reality, geospatial applications like ArcGIS, and even the desktop environments of modern Windows, macOS, and Linux systems require GPU support for acceptable performance.

Ray Tracing is a technique for simulating the effects of light and light-emitting devices on a scene. This computationally intensive process simulates scene lighting, reflections, refractions, shadows, and indirect lighting. It's possible to perform ray tracing in real-time with GPU acceleration. Our A40-equipped servers with second-generation RT Cores and third-generation Tensor Cores are ideal for ray tracing.

Frequently Asked Questions

How do I get started?

See the Vultr Cloud GPU Quickstart to get started.

How are vGPUs allocated?

When you deploy a Cloud GPU, you can pick the most cost-effective vGPU configuration for your application. Select a size ranging from 1/20th of a physical GPU with 4 GB of GPU RAM up to a full GPU card with 80 GB.

Can I use more than one GPU?

Yes! We have Bare Metal plans with up to 4 GPUs. Contact our sales team to deploy a Bare Metal server with more than one GPU.

Are vGPUs shared?

No. Vultr dedicates each vGPU core to your cloud server; they are not shared with other customers. Your instance never waits for other processes. Your workloads can run at peak speed and efficiency with 100% of your vGPU — all day, every day.

Do I need to install drivers?

No, there's no need to install drivers. Vultr pre-installs everything you need to get started. Our Cloud GPUs come with licensed NVIDIA drivers and the CUDA Toolkit. If you want a custom operating system that isn't in our library, install cloud-init, and we'll automatically install all the drivers and libraries you need through cloud-init vendor-data.

How do I reinstall or upgrade PyTorch?

PyTorch supports several different installation methods, which you'll find documented on their website. We recommend using Anaconda to install PyTorch in most cases. To get started with PyTorch, you can deploy our ready-to-run Anaconda and Miniconda Marketplace apps.

What CUDA versions does Vultr support?

Vultr Cloud GPUs are compatible with CUDA version 11.3 or later.
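A quick way to confirm the installed CUDA Toolkit meets that minimum is to parse the output of `nvcc --version`. The helper below is an illustration of ours, not Vultr tooling; it returns None on machines without the toolkit.

```python
import re
import shutil
import subprocess

def cuda_toolkit_version():
    """Return the CUDA Toolkit release (e.g. '11.3'), or None if nvcc is absent."""
    if shutil.which("nvcc") is None:
        return None
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    match = re.search(r"release (\d+\.\d+)", out)
    return match.group(1) if match else None

print(cuda_toolkit_version())
```

On a Vultr Cloud GPU instance, this should print 11.3 or a later release.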

How can I verify the GPU drivers are installed?

Check whether the NVIDIA kernel modules are loaded:

$ sudo lsmod | grep nvidia

If the command prints no output, try loading the modules manually:

$ sudo modprobe nvidia

The two most common reasons they may not load are:

  • The installer did not run.
  • A kernel update ran after the driver installation finished.

Installer did not run

To find out if the installer ran during deployment, test if this file exists:

$ sudo ls /opt/nvidia/drivers/linux_nvidia_client.run

If the file does not exist, please open a support ticket.

Fix a conflicting kernel update

If /opt/nvidia/drivers/linux_nvidia_client.run exists and the drivers aren't loaded, then it's possible a kernel update is blocking the drivers. Try re-running the installer to resolve the conflict. After this completes, the NVIDIA drivers should recognize the new kernel and load correctly.

$ sudo bash /opt/nvidia/drivers/linux_nvidia_client.run --ui=none --no-questions

Why would the GPU stop working?

The GPU will not run if the license file is missing or corrupt. The server installed the license file during deployment with cloud-init. If your GPU isn't working, verify the license file exists:

$ sudo ls /etc/nvidia/gridd.conf

If the file is missing, or you believe it is corrupt, please open a support ticket.