How to Use the Chat Endpoint in Vultr Serverless Inference

Updated on November 27, 2024

The Vultr Serverless Inference chat endpoint enables users to hold chat conversations with Large Language Models (LLMs). The service supports real-time interaction, leveraging advanced AI capabilities for dynamic and responsive communication. By integrating this endpoint, users can enhance their applications with conversational AI, improving user experience and operational efficiency.

Follow this guide to use the chat endpoint on your Vultr account via the Vultr Customer Portal.

  1. Navigate to Products, click Serverless, and then click Inference.

    Serverless Inference option in products menu

  2. Click your target inference service to open its management page.

    Selection of a target serverless inference service

  3. Open the Chat page.

    Button to open the chat endpoint page

  4. Select a preferred model.

    Button to select a preferred chat model

  5. Provide a Max Tokens value to cap the length of the model's response.

    Field to select the max token in the output

  6. Send a message in the chat window.

    Window to send a chat message to the model

  7. Click History to view chat history.

    Button to view the chat history with the model

  8. Click New Conversation to start a new chat.

    Button to start a new chat conversation with the model
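The portal chat window is a front end to an HTTP chat endpoint, so the same choices you make above (model, max tokens, message) map onto an API request. As a minimal sketch, assuming an OpenAI-style chat completions API (the URL, payload fields, and `Authorization` header below are illustrative assumptions, not details confirmed by this guide), a request could be composed like this:

```python
import json

# Assumed values for illustration only; substitute the real endpoint and
# key shown on your inference service's management page.
API_URL = "https://api.vultrinference.com/v1/chat/completions"  # assumed URL
API_KEY = "YOUR_INFERENCE_API_KEY"  # placeholder, not a real key

def build_chat_request(message: str, model: str, max_tokens: int = 512) -> dict:
    """Compose headers and an OpenAI-style chat payload; nothing is sent."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,            # step 4: your preferred model
        "max_tokens": max_tokens,  # step 5: caps the response length
        "messages": [              # step 6: the message typed in the chat window
            {"role": "user", "content": message}
        ],
    }
    return {"url": API_URL, "headers": headers, "body": json.dumps(payload)}

request = build_chat_request("Hello!", model="your-model-name", max_tokens=256)
print(request["url"])
```

Sending the composed request (for example with `requests.post`) and keeping the returned assistant messages in the `messages` list would mirror the conversation history the portal shows under History.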