How to Provision Vultr Serverless Inference

Updated on November 27, 2024

Vultr Serverless Inference is an efficient AI model hosting service that provides seamless scalability and reduced operational complexity for Generative AI applications. It supports a wide range of use cases, such as fraud detection in financial services, patient care and monitoring in healthcare, and real-time data analysis in retail. With reliable performance across six continents, Vultr ensures minimal latency for AI models while meeting stringent security and data compliance requirements.

Follow this guide to provision a Vultr Serverless Inference instance using the Vultr Customer Portal, API, or CLI.

  • Vultr Customer Portal
  • Vultr API
  • Vultr CLI
  1. Navigate to Products, click Serverless, and then click Inference.

    Serverless Inference option in products menu

  2. Click Add Serverless Inference.

    Add serverless inference button

  3. Provide a Label, acknowledge the charges note, and click Add Serverless Inference.

    Button for serverless inference creation

  1. Send a POST request to the Create Inference endpoint to create a Serverless Inference service.

    console
    $ curl "https://api.vultr.com/v2/inference" \
        -X POST \
        -H "Authorization: Bearer ${VULTR_API_KEY}" \
        -H "Content-Type: application/json" \
        --data '{
            "label" : "example-inference"
        }'
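
    The create request returns a JSON document describing the new service. A minimal sketch of capturing the service ID for later requests, using jq; the response shape shown below is an assumption modeled on Vultr's API conventions, so verify it against the actual response:

    console
    # Sample response shape (an assumption; check the real API output)
    response='{"inference": {"id": "example-id", "label": "example-inference"}}'

    # Extract the service ID for use in subsequent requests
    inference_id=$(echo "$response" | jq -r '.inference.id')
    echo "$inference_id"
    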
    
  2. Send a GET request to the List Inference endpoint to list all the available Serverless Inference services.

    console
    $ curl "https://api.vultr.com/v2/inference" \
        -X GET \
        -H "Authorization: Bearer ${VULTR_API_KEY}"
    
  1. Create a Serverless Inference service.

    console
    $ vultr-cli inference create --label example-service
    
  2. List all the available Serverless Inference services.

    console
    $ vultr-cli inference list