How to Use Vultr Cloud Inference in Node.js

Updated on April 22, 2024

Introduction

Vultr Cloud Inference allows you to run inference workloads for large language models such as Mixtral 8x7B, Mistral 7B, Meta Llama 2 70B, and more. Using Vultr Cloud Inference, you can run inference workloads without having to worry about the infrastructure, and you only pay for the input and output tokens.

This article demonstrates the step-by-step process of using Vultr Cloud Inference in Node.js.

Prerequisites

Before you begin, you must:

Have a Vultr Cloud Inference subscription and its API key.
Install Node.js 18 or later on your machine.

Set Up the Environment

Create a new project directory and navigate to the project directory.

console
$ mkdir vultr-cloud-inference-nodejs
$ cd vultr-cloud-inference-nodejs

Create a new Node.js project.

console
$ npm init -y

Install the required Node.js packages.

console
$ npm install openai
Note
You only need to install the openai package if you plan to use the OpenAI SDK. The direct API approach uses the built-in fetch API and requires no extra packages.

Inference via Direct API Calls

Vultr Cloud Inference provides a RESTful API to run inference workloads. You can use the built-in fetch API (available in Node.js 18 and later) to make the API calls.

Create a new JavaScript file named inference.js.

console
$ nano inference.js

Add the following code to inference.js.

javascript
const apiKey = process.env.VULTR_CLOUD_INFERENCE_API_KEY;

// Set the model
// List of available models: https://api.vultrinference.com/v1/chat/models
const model = '';
const messages = [
    {
        "role": "user",
        "content": "What is the capital of India?"
    }
];

const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
};

const url = 'https://api.vultrinference.com/v1/chat/completions';
const options = {
  method: 'POST',
  headers,
  body: JSON.stringify({model, messages})
};

fetch(url, options)
    .then(response => response.json())
    .then(data => {
        const llmResponse = data.choices[0].message.content;
        console.log(llmResponse);
    })
    .catch(error => console.error(error));

Run the JavaScript script.

console
$ export VULTR_CLOUD_INFERENCE_API_KEY=<your_api_key>
$ node inference.js

Here, you make a POST request to https://api.vultrinference.com/v1/chat/completions with the required headers and body. The messages array contains the conversation messages for which you want to generate a completion: role can be system, user, or assistant, and content is the message text.
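As a sketch of how the roles work together, a messages array can pair a system message, which steers the model's behavior, with the user's actual question. The wording of both prompts below is illustrative.

```javascript
// Illustrative messages array: the system message sets the assistant's
// behavior, and the user message asks the actual question.
const messages = [
    { role: "system", content: "You are a concise geography assistant." },
    { role: "user", content: "What is the capital of India?" }
];

console.log(messages.map(m => m.role).join(","));
```

The same array shape is accepted by both the direct API and the OpenAI SDK.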

To maintain conversation context, add the previous messages to the messages array. You can also set the stream parameter to receive completions in real time. For more information, refer to the Vultr Cloud Inference API documentation.
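The streaming path can be sketched as follows, assuming the streamed response uses the OpenAI-style server-sent events format (data: lines ending with data: [DONE]). The parseSseLine helper is hypothetical and only illustrates pulling the text delta out of each line; set model to one of the available models before running.

```javascript
// Hypothetical helper: extract the text delta from one server-sent
// events line of a streamed chat completion. Lines look like
//   data: {"choices":[{"delta":{"content":"Hel"}}]}
// and the stream ends with the sentinel line "data: [DONE]".
function parseSseLine(line) {
    if (!line.startsWith("data: ")) return null;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]" || payload === "") return null;
    const chunk = JSON.parse(payload);
    return chunk.choices?.[0]?.delta?.content ?? null;
}

async function streamCompletion() {
    const apiKey = process.env.VULTR_CLOUD_INFERENCE_API_KEY;
    if (!apiKey) return; // skip when no API key is configured

    const response = await fetch('https://api.vultrinference.com/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${apiKey}`,
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            model: '', // set to one of the available models
            messages: [{ role: "user", content: "What is the capital of India?" }],
            stream: true
        })
    });

    // Read the response body incrementally and print tokens as they arrive.
    const decoder = new TextDecoder();
    for await (const bytes of response.body) {
        for (const line of decoder.decode(bytes).split("\n")) {
            const piece = parseSseLine(line);
            if (piece) process.stdout.write(piece);
        }
    }
}

streamCompletion();
```

Printing each delta as it arrives lets you show partial output to the user instead of waiting for the full completion.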

Inference via the OpenAI SDK

Because Vultr Cloud Inference exposes an OpenAI-compatible API, you can use the openai package to make the API calls.

Create a new JavaScript file named inference_openai.js.

console
$ nano inference_openai.js

Add the following code to inference_openai.js.

javascript
const { OpenAI } = require("openai");

const apiKey = process.env.VULTR_CLOUD_INFERENCE_API_KEY;
const client = new OpenAI({
    apiKey,
    baseURL: "https://api.vultrinference.com/v1"
});

// Set the model
// List of available models: https://api.vultrinference.com/v1/chat/models
const model = '';
const messages = [
    {
        "role": "user",
        "content": "What is the capital of India?"
    }
];

async function main() {
    const output = await client.chat.completions.create({ messages, model });
    const llmResponse = output.choices[0].message.content;

    console.log(llmResponse);
}

main();

Run the JavaScript script.

console
$ export VULTR_CLOUD_INFERENCE_API_KEY=<your_api_key>
$ node inference_openai.js

Here, you use the openai package to make the API call against the Vultr Cloud Inference endpoint by setting baseURL. As with the direct API approach, the messages array contains the conversation messages, role can be system, user, or assistant, and content is the message text.
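Maintaining conversation context works the same way with the SDK: append the assistant's reply to the history before sending the next question. The sketch below uses a hypothetical withAssistantReply helper and an illustrative follow-up prompt; set model to one of the available models before running.

```javascript
// Hypothetical helper: return a new history that includes the
// assistant's answer, so the next request carries the full context.
function withAssistantReply(messages, reply) {
    return [...messages, { role: "assistant", content: reply }];
}

async function main() {
    const apiKey = process.env.VULTR_CLOUD_INFERENCE_API_KEY;
    if (!apiKey) return; // skip when no API key is configured

    const { OpenAI } = require("openai");
    const client = new OpenAI({
        apiKey,
        baseURL: "https://api.vultrinference.com/v1"
    });

    const model = ''; // set to one of the available models
    let messages = [
        { role: "user", content: "What is the capital of India?" }
    ];

    const first = await client.chat.completions.create({ messages, model });
    messages = withAssistantReply(messages, first.choices[0].message.content);

    // The follow-up question now carries the earlier exchange as context.
    messages.push({ role: "user", content: "What is its population?" });
    const second = await client.chat.completions.create({ messages, model });
    console.log(second.choices[0].message.content);
}

main();
```

Because the model is stateless between requests, resending the accumulated history is what lets pronouns like "its" in the follow-up resolve correctly.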

Conclusion

In this article, you learned how to use Vultr Cloud Inference in Node.js, both through direct API calls and through the OpenAI SDK. You can now integrate Vultr Cloud Inference into your Node.js applications to generate completions from large language models.