How to Provision Vultr Serverless Inference

Updated on September 23, 2024

Vultr Serverless Inference is an efficient AI model hosting service that provides seamless scalability and reduced operational complexity for Generative AI applications. With reliable performance across six continents, Vultr ensures minimal latency for AI models while meeting stringent security and data compliance requirements.

Follow this guide to provision a Vultr Serverless Inference subscription using the Vultr Customer Portal, API, or CLI.

  • Vultr Customer Portal
  • Vultr API
  • Vultr CLI

Vultr Customer Portal

  1. Navigate to Products, click Serverless, and then click Inference.

  2. Click Add Serverless Inference.

  3. Provide a Label, acknowledge the list of supported models and the charges note, and click Add Serverless Inference.

Vultr API

  1. Send a POST request to the Create Serverless Inference endpoint to create a Serverless Inference subscription.

    console
    $ curl "https://api.vultr.com/v2/inference" \
        -X POST \
        -H "Authorization: Bearer ${VULTR_API_KEY}" \
        -H "Content-Type: application/json" \
        --data '{
            "label" : "example-inference"
        }'
    
  2. Send a GET request to the List Serverless Inference endpoint to list all the available Serverless Inference subscriptions.

    console
    $ curl "https://api.vultr.com/v2/inference" \
        -X GET \
        -H "Authorization: Bearer ${VULTR_API_KEY}"
    
Vultr CLI

  1. Create a Serverless Inference subscription.

    console
    $ vultr-cli inference create --label example-inference
    
  2. List all the available Serverless Inference subscriptions.

    console
    $ vultr-cli inference list
    
