How to Build a vLLM Container Image

Updated on April 24, 2024

Introduction

vLLM is a fast inference and serving library for large language models (LLMs). It offers features such as integration with popular Hugging Face models, optimized CUDA kernels for NVIDIA GPUs, tensor parallelism support, and fast model execution.

This article explains how to build a vLLM container image using the Vultr Container Registry.

Prerequisites

Before you begin:

Set Up the Server

  1. Create a new directory to store your vLLM project files.

    console
    $ mkdir vllm-project
    
  2. Switch to the directory.

    console
    $ cd vllm-project
    
  3. Clone the vLLM project repository using Git.

    console
    $ git clone https://github.com/vllm-project/vllm/
    
  4. List files and verify that a new vllm directory is available.

    console
    $ ls
    
  5. Switch to the vllm project directory.

    console
    $ cd vllm
    
  6. List the directory files and verify that the necessary Dockerfile resources are available.

    console
    $ ls
    

    Output:

    benchmarks      collect_env.py   Dockerfile       docs       LICENSE                 pyproject.toml          requirements-common.txt  requirements-dev.txt     rocm_patch  vllm
    cmake           CONTRIBUTING.md  Dockerfile.cpu   examples   MANIFEST.in             README.md               requirements-cpu.txt     requirements-neuron.txt  setup.py
    CMakeLists.txt  csrc             Dockerfile.rocm 

    The vLLM project directory includes the following Dockerfile resources:

    • Dockerfile: Contains the main vLLM build context with support for NVIDIA GPU systems.
    • Dockerfile.cpu: Contains the vLLM build context for CPU-only systems.
    • Dockerfile.rocm: Contains the vLLM build context for AMD ROCm GPU systems.

    Use these resources in the following sections to build a container image for CPU or GPU systems.
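As a quick sketch, you can pick the matching Dockerfile automatically by probing the build host for a GPU toolchain. The nvidia-smi and rocm-smi checks below are illustrative heuristics, not part of the vLLM project:

```shell
# Pick a Dockerfile based on the hardware detected on the build host.
# Probing for nvidia-smi/rocm-smi is a heuristic assumption, not vLLM tooling.
if command -v nvidia-smi >/dev/null 2>&1; then
    DOCKERFILE=Dockerfile          # NVIDIA GPU build
elif command -v rocm-smi >/dev/null 2>&1; then
    DOCKERFILE=Dockerfile.rocm     # AMD GPU build
else
    DOCKERFILE=Dockerfile.cpu      # CPU-only build
fi
echo "Selected build context: $DOCKERFILE"
```

You can then pass the result to the build command in the next sections with `docker build -f "$DOCKERFILE"`.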

Build a vLLM Container Image for CPU Systems

Follow the steps below to build a new vLLM container image using Dockerfile.cpu, which contains the build context with all necessary packages and dependencies for CPU-based systems.

  1. Build a new container image using Dockerfile.cpu with all files in the project working directory. Replace vllm-image with your desired image name.

    console
    $ docker build -f Dockerfile.cpu -t vllm-image .
    
  2. View all Docker images on the server and verify that your new vLLM image is available.

    console
    $ docker images
    

    Output:

    REPOSITORY   TAG       IMAGE ID       CREATED              SIZE                                                                                                                                                                     
    vllm-image   latest    70f07e7c923f   About a minute ago   3.22GB
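To script the verification in step 2, you can filter the image listing for the new repository name. The sketch below inlines the sample output from above so it runs standalone; on a real host, pipe `docker images` into the same awk filter instead:

```shell
# Sample `docker images` output, inlined so this sketch is self-contained.
sample='REPOSITORY   TAG       IMAGE ID       CREATED              SIZE
vllm-image   latest    70f07e7c923f   About a minute ago   3.22GB'

# Print the repository name and size of the matching image, if present.
printf '%s\n' "$sample" | awk '$1 == "vllm-image" { print "Found image:", $1, "("$NF")" }'
```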

Build a vLLM Container Image for GPU Systems

The vLLM project directory contains two Dockerfile resources for building container images for GPU-powered systems. Follow the steps below to use the main Dockerfile resource to build a new container image for GPU systems.

  1. Build a new container image vllm-gpu-image using Dockerfile with all files in the project directory.

    console
    $ docker build -f Dockerfile -t vllm-gpu-image .
    
  2. View all Docker images on the server and verify that the new vllm-gpu-image is available.

    console
    $ docker images
    

    Output:

    REPOSITORY        TAG       IMAGE ID       CREATED       SIZE
    vllm-gpu-image   latest    bf92416d18b4   8 hours ago   8.88GB

    To run the vLLM GPU container image, verify that the target host runs at least the CUDA version referenced in the Dockerfile, and include the --gpus all option when starting the container. Run the following command to view the minimum CUDA version.

    console
    $ grep CUDA_VERSION= Dockerfile
    

    Output:

    ARG CUDA_VERSION=12.3.1
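To check compatibility without comparing version numbers by eye, you can order the host's CUDA version against the required minimum with sort -V. The host value below is an example; on a real host, take it from the nvidia-smi output:

```shell
# Compare a host CUDA version against the Dockerfile minimum.
# HOST is an example value; read it from `nvidia-smi` on a real system.
REQUIRED=12.3.1
HOST=12.4.0

# `sort -V` orders version strings numerically; the first line is the lowest.
LOWEST=$(printf '%s\n%s\n' "$REQUIRED" "$HOST" | sort -V | head -n1)
if [ "$LOWEST" = "$REQUIRED" ]; then
    echo "Host CUDA $HOST satisfies minimum $REQUIRED"
else
    echo "Host CUDA $HOST is older than required $REQUIRED"
fi
```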

Upload the vLLM Container Image to the Vultr Container Registry

  1. Open the Vultr Customer Portal.

  2. Click Products and select Container Registry on the main navigation menu.

    Manage a Vultr Container Registry

  3. Click your target Vultr Container Registry to open the management panel and view the registry access credentials.

  4. Copy the Registry URL value, Username, and API Key to use when accessing the registry.

    Open the Vultr Container Registry

  5. Switch to your server terminal session and log in to your Vultr Container Registry. Replace exampleregistry, exampleuser, and registry-password with your actual registry details.

    console
    $ docker login https://sjc.vultrcr.com/exampleregistry -u exampleuser -p registry-password
    
  6. Tag the vLLM container image with your desired Vultr Container Registry tag. For example, sjc.vultrcr.com/exampleregistry/vllm-gpu-image.

    console
    $ docker tag vllm-gpu-image sjc.vultrcr.com/exampleregistry/vllm-gpu-image
    
  7. View all Docker images on the server and verify that the new tagged image is available.

    console
    $ docker images
    

    Output:

    REPOSITORY                                       TAG       IMAGE ID       CREATED       SIZE
    vllm-gpu-image                                   latest    bf92416d18b4   8 hours ago   8.88GB
    sjc.vultrcr.com/exampleregistry/vllm-gpu-image   latest    bf92416d18b4   8 hours ago   8.88GB

  8. Push the tagged image to your Vultr Container Registry.

    console
    $ docker push sjc.vultrcr.com/exampleregistry/vllm-gpu-image
    
  9. Open your Vultr Container Registry management panel and click Repositories on the top navigation bar to verify that the new repository is available.

    View Vultr Container Registry Repositories
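Before pushing, it can help to sanity-check that the tag from step 6 follows the registry-url/registry-name/image-name pattern. The sketch below uses the example values from this section:

```shell
# Validate the tag format before `docker push` (values are this section's examples).
TAG=sjc.vultrcr.com/exampleregistry/vllm-gpu-image

case "$TAG" in
    *.vultrcr.com/*/*)
        echo "Tag looks valid for the Vultr Container Registry: $TAG" ;;
    *)
        echo "Unexpected tag format: $TAG"
        exit 1 ;;
esac
```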