How to Use the Chat Endpoint in Vultr Serverless Inference

Updated on September 23, 2024

The Vultr Serverless Inference chat endpoint enables users to engage in chat conversations with Large Language Models (LLMs). The service supports real-time interaction, leveraging advanced AI capabilities to facilitate dynamic and responsive communication. By integrating this endpoint, users can enhance their applications with sophisticated conversational AI, improving user experience and operational efficiency.

Follow this guide to use the chat endpoint on your Vultr account through the Vultr Customer Portal or the Vultr API.

  • Vultr Customer Portal
  • Vultr API
Vultr Customer Portal

  1. Navigate to Products, click Serverless, and then click Inference.

  2. Click your target inference subscription to open its management page.

  3. Open the Chat page.

  4. Select a preferred model.

  5. Provide a Max Tokens value to limit the length of the model's responses.

  6. Send a message in the chat window.

  7. Click History to view chat history.

  8. Click New Conversation to open a new chat window.

Vultr API

  1. Send a GET request to the List Serverless Inference endpoint and note the target inference subscription's ID.

    console
    $ curl "https://api.vultr.com/v2/inference" \
        -X GET \
        -H "Authorization: Bearer ${VULTR_API_KEY}"
    
  2. Send a GET request to the Serverless Inference endpoint and note the target inference subscription's API key.

    console
    $ curl "https://api.vultr.com/v2/inference/{inference-id}" \
        -X GET \
        -H "Authorization: Bearer ${VULTR_API_KEY}"
    
  3. Send a GET request to the List Models endpoint and note the preferred inference model's ID.

    console
    $ curl "https://api.vultrinference.com/v1/models" \
        -X GET \
        -H "Authorization: Bearer ${INFERENCE_API_KEY}"
    
  4. Send a POST request to the Create Chat Completion endpoint to chat with the preferred Large Language Model.

    console
    $ curl "https://api.vultrinference.com/v1/chat/completions" \
        -X POST \
        -H "Authorization: Bearer ${INFERENCE_API_KEY}" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "{model-id}",
            "messages": [
            {
                "role": "user",
                "content": "{user-input}"
            }
            ],
            "max_tokens": 512
        }'
    

    Visit the Create Chat Completion API page to view additional request attributes that give you greater control when interacting with the preferred inference model.
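The API steps above can also be scripted. The following is a minimal Python sketch using only the standard library; the `build_chat_payload` and `chat` helper names are illustrative, and the model ID and API key are placeholders for the values noted in steps 2 and 3:

```python
import json
import urllib.request

API_BASE = "https://api.vultrinference.com/v1"

def build_chat_payload(model_id, user_input, max_tokens=512):
    """Build the JSON body for the Create Chat Completion endpoint (step 4)."""
    return {
        "model": model_id,
        "messages": [
            {"role": "user", "content": user_input},
        ],
        "max_tokens": max_tokens,
    }

def chat(model_id, user_input, api_key, max_tokens=512):
    """Send one user message and return the parsed chat completion response."""
    request = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_chat_payload(model_id, user_input, max_tokens)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `chat("{model-id}", "Hello!", inference_api_key)` returns the completion response as a dictionary; verify the exact response shape against the Create Chat Completion API page before parsing fields from it.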
