Vultr Serverless Inference is an efficient AI model hosting service that provides seamless scalability and reduced operational complexity for Generative AI applications. It supports a wide range of use cases, such as fraud detection in financial services, patient care and monitoring in healthcare, and real-time data analysis in retail. With reliable performance across six continents, Vultr ensures minimal latency for AI models while meeting stringent security and data compliance requirements.
Follow this guide to provision a Vultr Serverless Inference instance using the Vultr Customer Portal, API, or CLI.
1. Navigate to Products, click Serverless, and then click Inference.
2. Click Add Serverless Inference.
3. Provide a Label, acknowledge the charges note, and click Add Serverless Inference.
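The API examples that follow authenticate with a personal access token passed in a Bearer header. A minimal setup sketch, assuming your key is stored in an environment variable named `VULTR_API_KEY` (the name the curl examples reference); the placeholder value is not a real key:

```shell
# Export your Vultr API key so the curl examples below can read it.
# Substitute the key generated in the Customer Portal's API settings;
# the value here is only a placeholder.
export VULTR_API_KEY="your-api-key-here"

# Confirm the variable is set before running any requests.
echo "Key length: ${#VULTR_API_KEY}"
```

Keeping the key in an environment variable avoids pasting it into every command and keeps it out of shell history when sourced from a protected file.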
Send a POST request to the Create Inference endpoint to create a Serverless Inference service.
$ curl "https://api.vultr.com/v2/inference" \
    -X POST \
    -H "Authorization: Bearer ${VULTR_API_KEY}" \
    -H "Content-Type: application/json" \
    --data '{
        "label": "example-inference"
    }'
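The create call returns a JSON description of the new service. As a sketch, assuming the response nests the service under an `inference` key with an `id` field (the field names here are assumptions; check the API reference for the actual schema), you could capture the ID for use in later requests:

```shell
# Stand-in for the curl output above; the structure and values are
# assumptions for illustration, not a documented response.
response='{"inference":{"id":"abc123","label":"example-inference"}}'

# Extract the service ID with a POSIX sed expression (no jq dependency).
inference_id=$(printf '%s' "$response" | sed -n 's/.*"id":"\([^"]*\)".*/\1/p')
echo "$inference_id"
```

In a real script you would assign `response=$(curl ...)` instead of the hard-coded sample.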
Send a GET request to the List Inference endpoint to list all available Serverless Inference services.
$ curl "https://api.vultr.com/v2/inference" \
    -X GET \
    -H "Authorization: Bearer ${VULTR_API_KEY}"
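The list call likewise returns JSON. A small filtering sketch, again assuming a hypothetical response shape with an `inference` array (verify against the API reference before relying on it), that pulls out just the service labels:

```shell
# Stand-in for the curl output above; the structure is an assumption.
response='{"inference":[{"id":"abc123","label":"example-inference"},{"id":"def456","label":"second-service"}]}'

# Print every label: grep -o emits only the matching fragments,
# then sed strips the surrounding key and quotes.
labels=$(printf '%s' "$response" | grep -o '"label":"[^"]*"' | sed 's/.*:"\(.*\)"/\1/')
echo "$labels"
```

This one-label-per-line output is convenient for piping into further shell loops or filters.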
Create a Serverless Inference service.
$ vultr-cli inference create --label example-service
List all available Serverless Inference services.
$ vultr-cli inference list