
Tool calling transforms your AI agents from simple chatbots into powerful assistants that can interact with real-world systems. With tool calling, you can build agents that fetch live weather data, query databases, execute API calls, and deliver accurate, data-driven responses, all through the Vultr Serverless Inference endpoint.
Vultr Serverless Inference removes the complexity of infrastructure management. You don't deploy servers, handle cold starts, or write custom API wrappers. You simply define your tools, connect to Vultr's Serverless Inference endpoint, and start building intelligent, function-aware applications.
Follow this guide to implement tool calling with the kimi-k2-instruct model hosted on Vultr Serverless Inference. You will learn how to define function schemas, send tool-enabled requests, execute functions with real data, and return the results to generate contextual, natural-language responses.
Prerequisites
Before you begin, you need to:
- Have a Vultr Serverless Inference subscription.
- Retrieve your API key for the Vultr Serverless Inference endpoint.
Make a Tool Calling Request via cURL
In this section, you send a tool-enabled request to the Vultr Serverless Inference API using cURL. This helps you test tool definitions, verify model behavior, and confirm that tool calls function as expected before integrating them into your application.
The example below defines a simple get_horoscope function. You first send a request that triggers a tool call, then send another request that returns the executed result to complete the interaction.
Export your Vultr Inference API key.
```console
$ export VULTR_INFERENCE_API_KEY=YOUR_VULTR_INFERENCE_API_KEY
```
Send a user message and define the tool schema.
```console
$ curl --location "https://api.vultrinference.com/v1/chat/completions" \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${VULTR_INFERENCE_API_KEY}" \
    --data '{
        "model": "kimi-k2-instruct",
        "messages": [
            {
                "role": "user",
                "content": "What is my horoscope? I am an Aquarius."
            }
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_horoscope",
                    "description": "Get today'\''s horoscope for an astrological sign.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "sign": {
                                "type": "string",
                                "description": "An astrological sign like Taurus or Aquarius"
                            }
                        },
                        "required": ["sign"]
                    }
                }
            }
        ],
        "tool_choice": "auto"
    }'
```
In the above cURL request:
- `model`: Specifies the model to use for inference. `kimi-k2-instruct` supports tool calling and can invoke defined functions automatically.
- `messages`: Defines the conversation history for the model.
- `messages[0].role`: Identifies the speaker. `"user"` indicates the message comes from the end user.
- `messages[0].content`: The user's input message, `"What is my horoscope? I am an Aquarius."`
- `tools`: Lists the available tools that the model can call.
- `tools[0].type`: Defines the tool type. Must be `"function"` for function-based tools.
- `tools[0].function.name`: Specifies the function name (`"get_horoscope"`) used when the model issues a tool call.
- `tools[0].function.description`: Briefly explains what the function does, which helps the model decide when to call it.
- `tools[0].function.parameters.type`: Defines the input format for the function. Always `"object"`.
- `tools[0].function.parameters.properties.sign`: Describes the expected field name, data type (`"string"`), and purpose (`"An astrological sign like Taurus or Aquarius"`).
- `tools[0].function.parameters.required`: Lists required fields. The `sign` field must be present for this function.
- `tool_choice`: Controls when the model calls tools. `"none"` disables tool calls, `"auto"` lets the model decide when to call tools (default), and `"required"` forces at least one tool call.
Output:
{ "id": "chatcmpl-44ae85bd4a554bf8a880db70ba2ce521", "model": "kimi-k2-instruct", "choices": [ { "message": { "role": "assistant", "content": "I'll retrieve today's horoscope for Aquarius.", "tool_calls": [ { "id": "functions.get_horoscope:0", "type": "function", "function": { "name": "get_horoscope", "arguments": "{\"sign\": \"Aquarius\"}" } } ] }, "finish_reason": "tool_calls" } ], ... }The response shows that the model successfully recognized the defined tool and generated a structured tool call instead of a final message. It identifies the function to execute (
get_horoscope) and includes the parsed argument ("sign": "Aquarius"). Thefinish_reasonvalue"tool_calls"indicates that the model has paused its response, waiting for your application to run the specified function and return the result in a follow-up request before completing the conversation.- model: Specifies the model to use for inference. 
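If you are checking this response programmatically rather than reading the JSON by hand, the extraction step looks roughly like the following Python sketch. It assumes the response shape shown above (trimmed to the fields used) and prints each requested call; note that `arguments` arrives as a JSON-encoded string, not an object.
```python
import json

# Parsed JSON body of the response above, trimmed to the fields used here.
response_json = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "id": "functions.get_horoscope:0",
                "type": "function",
                "function": {
                    "name": "get_horoscope",
                    "arguments": "{\"sign\": \"Aquarius\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }],
}

choice = response_json["choices"][0]
if choice["finish_reason"] == "tool_calls":
    for call in choice["message"]["tool_calls"]:
        # `arguments` is a JSON-encoded string and must be decoded before use.
        args = json.loads(call["function"]["arguments"])
        print(f'{call["function"]["name"]}({args}) [id: {call["id"]}]')
```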
 Send a response back with a static value to demonstrate a function call.
```console
$ curl --location "https://api.vultrinference.com/v1/chat/completions" \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${VULTR_INFERENCE_API_KEY}" \
    --data '{
        "model": "kimi-k2-instruct",
        "messages": [
            {
                "role": "user",
                "content": "What is my horoscope? I am an Aquarius."
            },
            {
                "role": "assistant",
                "content": null,
                "tool_calls": [
                    {
                        "id": "call_123",
                        "type": "function",
                        "function": {
                            "name": "get_horoscope",
                            "arguments": "{\"sign\": \"Aquarius\"}"
                        }
                    }
                ]
            },
            {
                "role": "tool",
                "tool_call_id": "call_123",
                "content": "{\"horoscope\": \"Aquarius: Next Tuesday you will befriend a baby otter.\"}"
            }
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_horoscope",
                    "description": "Get today'\''s horoscope for an astrological sign.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "sign": {
                                "type": "string",
                                "description": "An astrological sign like Taurus or Aquarius"
                            }
                        },
                        "required": ["sign"]
                    }
                }
            }
        ],
        "tool_choice": "auto"
    }'
```
In the above request:
- `messages[1].role`: Specifies `"assistant"`, representing the model's tool call request. This message includes the function name (`get_horoscope`) and its arguments.
- `messages[1].tool_calls`: Contains the tool call generated by the model in the previous step. The `"id"` uniquely identifies this call for linking the response.
- `messages[2].role`: Set to `"tool"`, indicating that this message contains the function's execution result.
- `messages[2].tool_call_id`: Matches the `"id"` from the assistant's tool call, ensuring the model correctly associates the response with the right tool call.
- `messages[2].content`: Includes the tool's output data: here, a static JSON value representing the horoscope.
- `tools`: Must still include the same function definition, even in the second call, so the model understands the schema of the response.
- `tool_choice`: Remains `"auto"`, allowing the model to decide whether additional tool calls are required or to finalize the conversation.
This request defines a static tool response to simulate a completed function call. The model interprets the returned data and generates a natural-language reply, even though no actual function execution occurs.
Output:
{ "id": "chatcmpl-fc6dac373cf048999bdac6de945594d0", "model": "kimi-k2-instruct", "choices": [ { "message": { "role": "assistant", "content": "Your Aquarius horoscope says: *Next Tuesday you will befriend a baby otter.* Sounds like a charming week ahead!" }, "finish_reason": "stop" } ], ... }The model receives the tool output, integrates it into its reasoning, and produces a complete natural-language response. The
finish_reasonvalue"stop"indicates that the model has successfully completed the conversation without requiring further tool calls.- messages[1].role: Specifies 
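To automate this round trip instead of hand-writing the second cURL request, an application appends the assistant's tool-call message and the tool result to the conversation and sends it again. The following is a minimal Python sketch, assuming the `requests` library is installed and the API key is exported as above; the static horoscope stands in for a real function.
```python
import json
import os

import requests

URL = "https://api.vultrinference.com/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['VULTR_INFERENCE_API_KEY']}"}
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_horoscope",
        "description": "Get today's horoscope for an astrological sign.",
        "parameters": {
            "type": "object",
            "properties": {"sign": {
                "type": "string",
                "description": "An astrological sign like Taurus or Aquarius",
            }},
            "required": ["sign"],
        },
    },
}]

messages = [{"role": "user", "content": "What is my horoscope? I am an Aquarius."}]
# `payload` holds a reference to `messages`, so turns appended below are
# included automatically when the payload is sent again.
payload = {"model": "kimi-k2-instruct", "messages": messages,
           "tools": TOOLS, "tool_choice": "auto"}

# First request: the model should reply with a tool call.
assistant = requests.post(URL, headers=HEADERS, json=payload).json()["choices"][0]["message"]
messages.append(assistant)

# Run each requested tool call and append its result as a "tool" message.
for call in assistant.get("tool_calls") or []:
    args = json.loads(call["function"]["arguments"])
    result = {"horoscope": f"{args['sign']}: Next Tuesday you will befriend a baby otter."}
    messages.append({"role": "tool", "tool_call_id": call["id"],
                     "content": json.dumps(result)})

# Second request: the model turns the tool output into a natural-language reply.
final = requests.post(URL, headers=HEADERS, json=payload).json()
print(final["choices"][0]["message"]["content"])
```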
 
Make a Tool Calling Request via Python
In this section, you perform the same tool-calling workflow using Python. This approach is ideal for integrating tool calling into real-world applications, where the model’s tool requests are handled programmatically. This example script demonstrates how to define a local function, detect a tool call from the model, execute the function, and send the result back to the model for a final, natural-language response.
Download the sample script.
```console
$ curl -O https://raw.githubusercontent.com/vultr-marketing/code-samples/refs/heads/main/vultr-inference-examples/tool-calling-weather.py
```
The script performs the following steps (a minimal sketch of the same flow follows this list):
- Defines a tool function: `get_weather(city)` fetches real weather data for a given city.
- Sends an initial message: prompts the model with a natural query (for example, `"What's the weather like in New Delhi?"`).
- Detects tool calls: checks whether the model requests a specific function call.
- Executes the function: retrieves real data and prepares it as structured JSON.
- Sends the result back: posts a second request with the tool result for a conversational reply.
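The sketch below mirrors those steps so you can see the shape of the flow before reading the full script. It uses the `requests` library against the same endpoint, but stubs `get_weather()` with static data (the downloaded script fetches live weather instead), so treat it as an outline rather than the script itself.
```python
import json
import os

import requests

URL = "https://api.vultrinference.com/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['VULTR_INFERENCE_API_KEY']}"}
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> dict:
    # Stubbed with static data for illustration; the downloaded script
    # retrieves real weather data for the given city instead.
    return {"city": city, "temperature": "9.1°C", "windspeed": "4.7 km/h"}

def chat(messages: list) -> dict:
    """Send one chat-completions request and return the parsed JSON body."""
    payload = {"model": "kimi-k2-instruct", "messages": messages,
               "tools": TOOLS, "tool_choice": "auto"}
    response = requests.post(URL, headers=HEADERS, json=payload)
    response.raise_for_status()
    return response.json()

# Send the initial message built from the user's input.
city = input("Enter a city name: ")
messages = [{"role": "user", "content": f"What's the weather like in {city}?"}]
assistant = chat(messages)["choices"][0]["message"]
messages.append(assistant)

# Detect tool calls, execute the function, and send each result back.
for call in assistant.get("tool_calls") or []:
    args = json.loads(call["function"]["arguments"])
    messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(get_weather(args["city"])),
    })

# The second request yields the conversational, natural-language reply.
print(chat(messages)["choices"][0]["message"]["content"])
```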
 Create a Python environment and activate it.
```console
$ python3 -m venv env && source env/bin/activate
```
Install the required dependencies.
```console
$ pip install requests python-dotenv
```
Export your Vultr Inference API key as an environment variable.
```console
$ export VULTR_INFERENCE_API_KEY=YOUR_VULTR_INFERENCE_API_KEY
```
Execute the Python script.
```console
$ python3 tool-calling-weather.py
```
When you run the script, it prompts you to enter a city name.
Output:
```
Vultr Serverless Inference: Tool Calling Demo
Enter a city name: Tokyo

Sending initial request to model...
Model requested function: get_weather({'city': 'Tokyo'})
Tool result: {'city': 'Tokyo', 'temperature': '9.1°C', 'windspeed': '4.7 km/h', 'winddirection': '351°', 'time': '2025-11-03T20:30'}

Sending tool result back to model...

Model's Final Response:
Right now in Tokyo (as of 8:30 p.m. local time, 3 Nov 2025):
• Temperature: 9.1 °C (about 48 °F) – a cool evening, so have a light jacket if you're stepping out.
• Wind: Light, 4.7 km/h from the north-northwest (351°).
```
The script automatically retrieves real-time weather data for the entered city, processes it through the model using Vultr Serverless Inference, and returns a conversational natural-language response describing the weather conditions.
Conclusion
You have successfully implemented tool calling using Vultr Serverless Inference with the `kimi-k2-instruct` model. You defined custom tool schemas, sent requests that triggered model-initiated function calls, and returned structured outputs for contextual responses. Using both cURL and Python, you simulated tool responses and executed real function calls, including fetching live weather data.