Vultr Cloud Inference

Updated on April 22, 2024
Vultr Cloud Inference header image

Introduction

Vultr Cloud Inference allows you to run inference workloads for large language models such as Mixtral 8x7B, Mistral 7B, Meta Llama 2 70B, and more. Using Vultr Cloud Inference, you can run inference workloads without having to worry about the infrastructure, and you only pay for the input and output tokens.

This article demonstrates the step-by-step process of getting started with Vultr Cloud Inference.

Create a Vultr Cloud Inference Subscription

  1. Log in to the Vultr Customer Portal.

  2. Navigate to Products > Cloud Inference.

  3. Click the Add Cloud Inference button.

    Add Vultr Cloud Inference Subscription

  4. Fill in the label field and click the Add Cloud Inference button.

    Fill in the Label Field

  5. Once the subscription is created, you will see the subscription page.

Manage Vultr Cloud Inference Subscription

  1. Log in to the Vultr Customer Portal.

  2. Navigate to Products > Cloud Inference.

  3. Click the three dots on the right side of the subscription and select Manage.

  4. You can see the subscription details and manage the subscription.

    Vultr Cloud Inference Manage Subscription

Details Tab

On the details tab, you can see the subscription details such as the API key, the number of tokens used, and the number of tokens remaining.

Vultr Cloud Inference Details Tab

Here you can find:

  1. General Information: Find the subscription label, creation date, monthly cost, and included tokens.
  2. API Key: Find the API key for the subscription.
  3. Usage Details: Find the number of tokens used and the number of tokens remaining.

Prompt Tab

On the prompt tab, you can enter a prompt, select the machine learning model, and run test inference workloads.

Vultr Cloud Inference Prompt Tab

When you populate the prompt field, you can set:

  1. Model: Select the machine learning model.
  2. Max Tokens: Set the maximum number of tokens to generate.
  3. Seed: Set the random seed for reproducibility.
  4. Temperature: Set the sampling temperature for generation. Higher values produce more varied output; lower values make the output more deterministic.
  5. Top-K: Set the top-k value for generation. This restricts sampling to the k most probable tokens at each step.
  6. Top-P: Set the top-p (nucleus sampling) value for generation. This restricts sampling to the smallest set of tokens whose cumulative probability exceeds p, controlling output diversity.
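The prompt-tab controls above map directly onto the fields of an inference request. The sketch below assembles such a request body; the field names follow common OpenAI-style conventions and are assumptions for illustration, not confirmed Vultr Cloud Inference field names.

```python
# Sketch: build a request body mirroring the Prompt tab controls.
# Field names follow OpenAI-style conventions and are assumptions,
# not confirmed Vultr Cloud Inference field names.

def build_prompt_request(prompt, model, max_tokens=256,
                         seed=None, temperature=0.8, top_k=40, top_p=0.9):
    """Assemble a request body from the Prompt tab parameters."""
    body = {
        "model": model,              # machine learning model to use
        "prompt": prompt,            # the prompt text
        "max_tokens": max_tokens,    # cap on the number of generated tokens
        "temperature": temperature,  # sampling randomness
        "top_k": top_k,              # restrict sampling to the k most likely tokens
        "top_p": top_p,              # nucleus sampling cutoff
    }
    if seed is not None:
        body["seed"] = seed          # fixed seed for reproducible output
    return body

request = build_prompt_request("Explain inference in one sentence.",
                               model="mistral-7b", seed=42)
```

Omitting the seed leaves it out of the request entirely, so repeated runs remain non-deterministic by default.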

Chat Tab

On the chat tab, you can select the machine learning model and run chat completions with conversation history.

Vultr Cloud Inference Chat Tab

Here you can set:

  1. Model: Select the machine learning model.
  2. Max Tokens: Set the maximum number of tokens to generate.
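Conversation history is simply a growing list of alternating messages that is sent with every completion. A minimal sketch of that structure, assuming the OpenAI-compatible `role`/`content` message format:

```python
# Sketch: maintain a chat history as alternating role/content messages,
# the structure the OpenAI-compatible Chat Completions format expects.

history = []

def add_user_message(text):
    history.append({"role": "user", "content": text})

def add_assistant_message(text):
    history.append({"role": "assistant", "content": text})

add_user_message("What is Vultr Cloud Inference?")
add_assistant_message("A serverless inference service.")
add_user_message("How is it billed?")  # the model sees the full history above
```

Because the whole list is sent each time, follow-up questions like "How is it billed?" can rely on earlier turns for context.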

Frequently Asked Questions (FAQs)

Frequently asked questions about Vultr Cloud Inference.

What is Vultr Cloud Inference?

Vultr Cloud Inference is a service that allows you to run inference workloads for large language models such as Mixtral 8x7B, Mistral 7B, Meta Llama 2 70B, and more. Using Vultr Cloud Inference, you can run inference workloads without having to worry about the infrastructure, and you only pay for the input and output tokens.

Is Vultr Cloud Inference compatible with OpenAI?

Yes, Vultr Cloud Inference is compatible with OpenAI client libraries. You can swap out the OpenAI base API URL for the Vultr Cloud Inference API URL and use your Vultr Cloud Inference API key to run inference workloads. Note that only Chat Completions is currently supported.

What is the pricing model for Vultr Cloud Inference?

Vultr Cloud Inference subscriptions have a fixed monthly cost of $10 that includes 5 million tokens. Additional tokens are billed at $0.0002 per 1,000 tokens. Note that the pricing model may change in the future; for the latest pricing, refer to the Vultr Cloud Inference pricing page.
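As a worked example of that pricing model, the helper below estimates a monthly bill from token usage, using the figures quoted in this article; check the pricing page for current rates before relying on these numbers.

```python
# Sketch: estimate the monthly bill from token usage, using the figures in
# this article ($10 base including 5M tokens, $0.0002 per 1,000 extra tokens).
# Pricing may change; check the pricing page for current rates.

BASE_COST = 10.00            # fixed monthly cost in USD
INCLUDED_TOKENS = 5_000_000  # tokens covered by the base cost
OVERAGE_PER_1000 = 0.0002    # USD per 1,000 tokens beyond the allowance

def monthly_cost(tokens_used):
    """Return the estimated monthly cost in USD for a given token count."""
    overage = max(0, tokens_used - INCLUDED_TOKENS)
    return BASE_COST + (overage / 1000) * OVERAGE_PER_1000

print(monthly_cost(4_000_000))   # within the allowance: 10.0
print(monthly_cost(10_000_000))  # 5M extra tokens: 10.0 + 5000 * 0.0002 = 11.0
```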

How do the Vultr API and the Vultr Cloud Inference API differ from each other?

The Vultr API is used to manage infrastructure such as compute instances, block storage, networking, and other cloud resources. The Vultr Cloud Inference API is used to run inference workloads for machine learning models. When you create a Vultr Cloud Inference subscription, you get an API key that is used to access the Vultr Cloud Inference API.

How to find all API endpoints for Vultr Cloud Inference?

You can find all the API endpoints for Vultr Cloud Inference in the Vultr Cloud Inference API documentation.

How to list the available large language models in Vultr Cloud Inference?

You can list the available language models in Vultr Cloud Inference using the List Chat Completion Models API endpoint.
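A standard-library sketch of calling that endpoint. The path `/v1/chat/models` and the base URL are assumptions; confirm the exact endpoint in the Vultr Cloud Inference API documentation.

```python
# Sketch: query the model-list endpoint with only the standard library.
# The base URL and the "/v1/chat/models" path are assumptions; confirm
# the exact endpoint in the Vultr Cloud Inference API documentation.
import json
import urllib.request

API_BASE = "https://api.vultrinference.com"  # assumed API base URL

def build_list_models_request(api_key):
    """Build (but do not send) an authenticated GET request for the model list."""
    return urllib.request.Request(
        f"{API_BASE}/v1/chat/models",
        headers={"Authorization": f"Bearer {api_key}"},  # subscription API key
        method="GET",
    )

def list_models(api_key):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_list_models_request(api_key)) as resp:
        return json.load(resp)

req = build_list_models_request("example-key")
```

The same bearer-token header authenticates every Vultr Cloud Inference API call.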

How to get the API key for Vultr Cloud Inference?

You can get the API key for Vultr Cloud Inference from the subscription details page in the Vultr Customer Portal.

Vultr Cloud Inference API Key

How to regenerate the API key for Vultr Cloud Inference?

You can regenerate the API key for Vultr Cloud Inference from the subscription details page in the Vultr Customer Portal.

Vultr Cloud Inference Regenerate API Key

Why am I not getting high-quality outputs from Vultr Cloud Inference?

Vultr does not create the machine learning models; it only provides the infrastructure to run inference workloads. The quality of the outputs depends on the machine learning model you use. If you are not getting high-quality outputs, try a different machine learning model or enhance the system and user prompts.
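One common way to enhance prompts is to prepend a system message that steers tone and format, assuming the OpenAI-compatible Chat Completions message format:

```python
# Sketch: steer output quality by prepending a system message to the
# message list, per the OpenAI-compatible Chat Completions format.
# The prompts below are illustrative only.

def with_system_prompt(system_prompt, user_prompt):
    """Return a message list with a system message steering the reply."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = with_system_prompt(
    "You are a concise technical writer. Answer in at most two sentences.",
    "Explain what an inference workload is.",
)
```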

How to delete the Vultr Cloud Inference subscription?

You can delete the Vultr Cloud Inference subscription from the subscription details page in the Vultr Customer Portal.

Vultr Cloud Inference Delete Subscription

Learn More