Find Servers That Handled Most Number of Requests

Updated on 28 May, 2025
Find Servers That Handled Most Number of Requests header image

Problem Statement

In a system with k servers indexed from 0 to k-1, each server handles requests but can only process one request at any given time. Requests are received at times specified in the arrival array and each request has a specific load time that it takes to complete. If at the arrival of a request, the designated server (as determined by computing the index i % k where i is the index of the arrival) is busy, then the next available server is searched in a circular manner starting from the server next to designated one. If no servers are available, the request is dropped.

This process is repeated for each incoming request as specified in the strictly increasing arrival array. The primary goal is to determine the server(s) that handle the highest number of requests, these servers are termed as the "busiest servers".

The answer should be a list of indices of these busiest servers, in any order.

Examples

Example 1

Input:

k = 3, arrival = [1,2,3,4,5], load = [5,2,3,3,3]

Output:

[1]

Explanation:

All of the servers start out available.
The first 3 requests are handled by the first 3 servers in order.
Request 3 comes in. Server 0 is busy, so it's assigned to the next available server, which is 1.
Request 4 comes in. It cannot be handled since all servers are busy, so it is dropped.
Servers 0 and 2 handled one request each, while server 1 handled two requests. Hence server 1 is the busiest server.

Example 2

Input:

k = 3, arrival = [1,2,3,4], load = [1,2,1,2]

Output:

[0]

Explanation:

The first 3 requests are handled by first 3 servers.
Request 3 comes in. It is handled by server 0 since the server is available.
Server 0 handled two requests, while servers 1 and 2 handled one request each. Hence server 0 is the busiest server.

Example 3

Input:

k = 3, arrival = [1,2,3], load = [10,12,11]

Output:

[0,1,2]

Explanation:

Each server handles a single request, so they are all considered the busiest.

Constraints

  • 1 <= k <= 105
  • 1 <= arrival.length, load.length <= 105
  • arrival.length == load.length
  • 1 <= arrival[i], load[i] <= 109
  • arrival is strictly increasing.

Approach and Intuition

The problem essentially revolves around efficiently managing and monitoring the availability of servers over the course of multiple request arrivals and process completions.

  1. Maintain an array, next_free_time, of size k which tracks the next free time for each server. Initially, all values in this array can be set to zero, indicating that all servers are free from the start.

  2. Use a data structure, such as a priority queue, to ensure that you always have quick access to the earliest freeing server, which can be crucial when all preferred servers are busy, hence providing a way to wrap around and check servers efficiently.

  3. Use an array, requests_handled, of size k to count the requests handled by each server.

  4. For each arrival[i]:

    • Determine the preferred server using the modulo operation i % k.
    • Check if the preferred server is free. If not, check the next server in a circular manner using the priority queue to efficiently find the next available server.
    • If a server is found:
      • Update its entry in next_free_time to the current arrival[i] plus load[i] indicating when it will be free next.
      • Increment its count in requests_handled.
    • If no server is free, the request is dropped.
  5. At the end of all arrivals, determine the maximum value in requests_handled. Any server index that matches this count is part of the busiest servers.

This method ensures an efficient handling and assignment of requests while keeping track of the server utilization. The use of the priority queue minimizes the overhead of finding the next available server, thus handling the constraints of up to 10^5 servers and requests efficiently.

Solutions

  • C++
  • Java
cpp
class Solution {
public:
    vector<int> matchLoad(int k, vector<int>& arrival, vector<int>& load) {
        vector<int> serverCount(k, 0);
        priority_queue<int, vector<int>, greater<int>> availServers;
        priority_queue<pair<int, int>, vector<pair<int, int>>, greater<>> occupiedServers;

        for (int i = 0; i < k; ++i) {
            availServers.push(i);
        }

        for (int i = 0; i < arrival.size(); ++i) {
            int startTime = arrival[i];

            while (!occupiedServers.empty() && occupiedServers.top().first <= startTime) {
                auto [finishTime, serverID] = occupiedServers.top();
                occupiedServers.pop();
                int newID = ((serverID - i) % k + k) % k + i;
                availServers.push(newID);
            }

            if (!availServers.empty()) {
                int assignedID = availServers.top() % k;
                availServers.pop();
                occupiedServers.push({startTime + load[i], assignedID});
                serverCount[assignedID]++;
            }
        }

        vector<int> results;
        int maxLoad = *max_element(serverCount.begin(), serverCount.end());
        for (int i = 0; i < k; ++i) {
            if (serverCount[i] == maxLoad) {
                results.push_back(i);
            }
        }
        
        return results;
    }
};

To identify the servers that processed the most requests, the given C++ solution performs a simulation using priority queues to efficiently manage server availability and task durations.

  • The function matchLoad takes three parameters:

    • k denotes the total number of servers.
    • arrival is a vector where each element represents the time a request arrives.
    • load is a vector where each corresponding element represents the duration of the request initiated at the arrival time.
  • Maintain two main data structures:

    • serverCount, a vector that tracks the number of requests each server handles.
    • availServers, a priority queue (min-heap) that stores the indices of available servers.
    • occupiedServers, a priority queue (min-heap) that tracks servers currently occupied by requests. It pairs the finish time of a request with the server's index.
  • Initialize availServers by pushing all server indices into it, signifying that all servers are initially available.

  • Iteratively process each request from the arrival vector:

    1. For each incoming request, first release any servers that have completed their tasks by the current request's start time from occupiedServers. Re-adjust and return these servers to availServers.
    2. Assign a free server from availServers to the incoming request. Update occupiedServers to include the server with its new finish time (current time + load).
    3. Increment the server's count in serverCount, indicating it has handled another request.
  • After processing all requests:

    • Determine the maximum value in serverCount to identify the servers that handled the most requests.
    • Collect and return the indices of all servers that handled this maximum number of requests.

By using priority queues, the algorithm efficiently manages the servers and queues, scales well with the number of requests, and ensures all servers are utilized optimally based on their availability and load conditions.

java
class Solution {
    public List<Integer> getBusiestServers(int numServers, int[] arrivalTimes, int[] workLoads) {
        int[] requestCounts = new int[numServers];
        PriorityQueue<Integer> available = new PriorityQueue<>((x, y) -> x - y);
        PriorityQueue<Pair<Integer, Integer>> inUse = new PriorityQueue<>((x, y) -> x.getKey() - y.getKey());

        // Initialize all servers as available.
        for (int i = 0; i < numServers; ++i) {
            available.add(i);
        }
        
        for (int i = 0; i < arrivalTimes.length; ++i) {
            int currentTime = arrivalTimes[i];

            // Release and reassign servers based on availability and current time
            while (!inUse.isEmpty() && inUse.peek().getKey() <= currentTime) {
                Pair<Integer, Integer> current = inUse.poll();
                int serverIdx = current.getValue();
                int updatedIdx = ((serverIdx - i) % numServers + numServers) % numServers + i;
                available.add(updatedIdx);
            }

            // Assign server to handle current request
            if (!available.isEmpty()) {
                int nextServer = available.poll() % numServers;
                inUse.add(new Pair<>(currentTime + workLoads[i], nextServer));
                requestCounts[nextServer]++;
            }
        }
        
        // Determine servers with the highest number of requests handled
        int maxRequests = Collections.max(Arrays.stream(requestCounts).boxed().collect(Collectors.toList()));
        List<Integer> busiest = new ArrayList<>();
        for (int i = 0; i < numServers; ++i) {
            if (requestCounts[i] == maxRequests) {
                busiest.add(i);
            }
        }
        
        return busiest;
    }
}

In this Java solution, the goal is to identify the server(s) that handled the maximum number of requests among a given set of servers. The solution employs two primary data structures, each serving different purposes to efficiently manage server workload distribution and tracking:

  • A PriorityQueue named available keeps track of servers that are immediately available to handle incoming requests. Servers are prioritized for assignment based on their numerical order, ensuring fair distribution of requests.

  • Another PriorityQueue named inUse tracks servers that are currently busy handling requests. It pairs the time a server will be free with the server's index, allowing the system to manage server release and reassignment based on real-time availability.

The procedure performs the following steps:

  1. Initializes all servers as available by adding each server's index to the available queue.

  2. Iterates over each request's arrival time. For each time step:

    • Releases servers whose workload has been completed up to the current time. Reassigns their indices to reflect the circular nature of server assignment (using modular arithmetic).

    • Allocates the next available server to handle the incoming request. The server is then marked busy until it finishes handling its assigned workload, achieved by inserting its future availability time into the inUse queue.

    • Tracks the number of requests handled by each server using an array requestCounts, incrementing the count for the server handling the current request.

  3. After all requests have been processed, the method checks requestCounts to identify the maximum number of requests handled by any server.

  4. Compiles a list of all servers that achieved this maximum request count, thus identifying the busiest servers.

This method allows efficient distribution and tracking of workload across servers, utilizing priority queues to minimize idle time and ensure prompt response to incoming requests. The result is a list of server indices that represent the most loaded servers in terms of number of requests handled.

Comments

No comments yet.