Vultr Serverless Inference is an efficient AI model hosting service that provides seamless scalability and reduced operational complexity for Generative AI applications. With reliable performance across six continents, Vultr ensures minimal latency for AI models while meeting stringent security and data compliance requirements.
Follow this guide to provision a Vultr Serverless Inference subscription using the Vultr Customer Portal, API, or CLI.
Navigate to Products, click Serverless, and then click Inference.
Click Add Serverless Inference.
Provide a Label, acknowledge the list of supported models and the charges note, and click Add Serverless Inference.
Send a POST request to the Create Serverless Inference endpoint to create a Serverless Inference subscription.
$ curl "https://api.vultr.com/v2/inference" \
-X POST \
-H "Authorization: Bearer ${VULTR_API_KEY}" \
-H "Content-Type: application/json" \
--data '{
"label" : "example-inference"
}'
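When scripting subscription creation, you can capture the new subscription's ID from the JSON response for use in later requests. The sketch below assumes the response wraps the subscription in an `inference` object containing an `id` field; verify the exact response shape against the Vultr API reference before relying on it.

```shell
# Create a subscription and store the returned ID for later use.
# Assumption: the response has the shape {"inference": {"id": "...", ...}}.
INFERENCE_ID=$(curl -s "https://api.vultr.com/v2/inference" \
  -X POST \
  -H "Authorization: Bearer ${VULTR_API_KEY}" \
  -H "Content-Type: application/json" \
  --data '{"label": "example-inference"}' \
  | jq -r '.inference.id')

echo "Created subscription: ${INFERENCE_ID}"
```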
Send a GET request to the List Serverless Inference endpoint to list all the available Serverless Inference subscriptions.
$ curl "https://api.vultr.com/v2/inference" \
-X GET \
-H "Authorization: Bearer ${VULTR_API_KEY}"
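To get a quick tabular view of your subscriptions, you can filter the list response with `jq`. This sketch assumes the response has the shape `{"inference": [{"id": ..., "label": ...}, ...]}`; check the API reference for the authoritative field names.

```shell
# List subscriptions and print each ID alongside its label, tab-separated.
# Assumption: the response has the shape {"inference": [{"id": ..., "label": ...}, ...]}.
# The `// []` fallback keeps jq from erroring if the key is absent (e.g. on an auth error).
curl -s "https://api.vultr.com/v2/inference" \
  -X GET \
  -H "Authorization: Bearer ${VULTR_API_KEY}" \
  | jq -r '(.inference // [])[] | "\(.id)\t\(.label)"'
```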
Create a Serverless Inference subscription.
$ vultr-cli inference create --label example-inference
List all the available Serverless Inference subscriptions.
$ vultr-cli inference list
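The two CLI commands above can be combined into a simple idempotent wrapper that only creates the subscription if its label is not already listed. This is a sketch using only the documented `create` and `list` subcommands; the `grep` match assumes `vultr-cli inference list` prints each subscription's label in its output.

```shell
# Create the subscription only if no subscription with this label is listed yet.
# Assumption: `vultr-cli inference list` includes each subscription's label in its output,
# so a whole-word grep is enough for this sketch.
LABEL="example-inference"
if vultr-cli inference list | grep -qw "${LABEL}"; then
  echo "Subscription '${LABEL}' already exists."
else
  vultr-cli inference create --label "${LABEL}"
fi
```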