Storage Performance for Vultr Object Storage

Updated on 01 April, 2026

Comprehensive benchmarking and performance analysis of Vultr Object Storage tiers, rate limits, workloads, and Warp-based replication methodology.


We have conducted extensive benchmarking of Vultr Object Storage across all performance tiers. For additional context on the performance metrics referenced throughout this document and to better understand how storage performance is measured, benchmarked, and compared, see What Are the Fundamentals of Storage Performance?.

Vultr Object Storage is available in four performance tiers: Standard, Premium, Performance, and Accelerated.

The Standard and Premium tiers are primarily implemented on HDD storage with flash acceleration for metadata and write-ahead caching. This architecture improves metadata responsiveness and write performance while maintaining cost efficiency.

The Performance and Accelerated tiers are based on NVMe storage. These tiers are optimized for workloads running on Vultr Cloud Compute and GPU instances within the same data center and provide significantly higher throughput and operations per second than typical Internet-facing object storage.

Rate Limits

Each tier of Vultr Object Storage is rate limited at the object gateway so that no single subscription can exceed set operations-per-second and throughput levels. These limits prevent one object storage subscription from consuming all of the network throughput or processing power available on the object gateway for its own workload, which would starve competing workloads.

Tier Operations Limit Throughput Limit
Standard 800 ops per second 600 MiB/s or 5.0 Gbps
Premium 1,000 ops per second 800 MiB/s or 6.7 Gbps
Performance 4,000 ops per second 1,000 MiB/s or 8.3 Gbps
Accelerated 10,000 ops per second 5,000 MiB/s or 41.9 Gbps

It is important to understand how these limits interact with object sizes. For instance, object sizes smaller than 524,288 bytes will not reach the Accelerated tier’s throughput limit before reaching its operations limit.

Note also that metadata operations count toward the total operations per second. They subtract from the operations available for reads and writes, which either reduces the throughput attainable by the remaining operations or requires larger object sizes to make the throughput limit reachable.
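The crossover point can be computed directly from the table above: a tier's break-even object size is its throughput limit divided by its operations limit. A minimal Python sketch using the published limits:

```python
# Break-even object size per tier: below this size the operations limit is
# reached before the throughput limit; above it, the reverse.
# Limits are taken from the rate-limit table above.
TIERS = {
    "Standard":    {"ops": 800,    "mib_per_s": 600},
    "Premium":     {"ops": 1_000,  "mib_per_s": 800},
    "Performance": {"ops": 4_000,  "mib_per_s": 1_000},
    "Accelerated": {"ops": 10_000, "mib_per_s": 5_000},
}

MIB = 1024 * 1024

def break_even_bytes(tier: str) -> int:
    """Object size (bytes) at which the ops and throughput limits coincide."""
    t = TIERS[tier]
    return t["mib_per_s"] * MIB // t["ops"]

for name in TIERS:
    print(f"{name}: {break_even_bytes(name):,} bytes")
```

For the Accelerated tier this yields 524,288 bytes, matching the figure quoted above; objects smaller than a tier's break-even size will hit the operations limit first.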

Benchmark Results

Because the service is rate limited on both operations per second and throughput, it is possible to reach one limit before the other. Small object sizes reach the operations threshold before the throughput threshold, and large object sizes reach the throughput threshold before the operations threshold. Moreover, metadata operations count toward the operations-per-second limit and so can reduce attainable throughput.

Standard Tier

The Standard tier is what most people think of as conventional object storage: Internet-facing, with clients typically accessing it over the Internet. Access from Vultr Cloud Compute is possible, of course, but the general expectation is relatively low throughput and high latency. This makes it well suited to infrequently accessed objects and to accumulating large amounts of data retained over long periods.

Operation Type Object Size Objects/s Throughput (MiB/s)
PUT 4 KB 469.9 1.8
PUT 64 KB 343.2 20.1
PUT 512 KB 393.6 190.1
PUT 1024 KB 595.1 567.4
PUT 10240 KB 57.9 552.8
GET 4 KB 361.3 1.3
GET 64 KB 422.4 25.7
GET 512 KB 390.4 190.6
GET 1024 KB 613.7 585.3
GET 10240 KB 59.7 570.1

Premium Tier

The Premium tier offers a 25% increase in operations per second and a 33% increase in throughput relative to the Standard tier, along with slightly better write performance. This makes it better suited to workloads that need better overall performance, particularly for writes, but are still geared toward clients accessing it over the Internet and expecting relatively low throughput and high latency.

Operation Type Object Size Objects/s Throughput (MiB/s)
PUT 4 KB 560.5 2.1
PUT 64 KB 540.7 33.0
PUT 512 KB 539.7 263.5
PUT 1024 KB 696.2 650.4
PUT 10240 KB 77.2 736.9
GET 4 KB 613.7 2.1
GET 64 KB 569.2 34.7
GET 512 KB 569.2 277.9
GET 1024 KB 793.7 756.9
GET 10240 KB 79.4 757.9

Performance Tier

The Performance tier is backed by NVMe storage but aims at cost efficiency rather than maximum performance. It allows significantly more operations per second and higher throughput than the Premium tier. It is also intended for access primarily by Vultr Cloud Compute in the same data center rather than by clients over the Internet. This is not to say that it can’t be accessed over the Internet, but rather that in-data center clients will see benefits from local connectivity as compared to slower tiers.

Operation Type Object Size Objects/s Throughput (MiB/s)
PUT 4 KB 2,462.2 9.3
PUT 64 KB 2,817.8 171.9
PUT 512 KB 1,973.8 963.7
PUT 1024 KB 1,010.5 963.7
PUT 10240 KB 101.7 970.2
GET 4 KB 2,872.9 10.9
GET 64 KB 2,376.4 145.0
GET 512 KB 1,974.3 964.0
GET 1024 KB 1,010.6 963.8
GET 10240 KB 101.1 964.1

Accelerated Tier

The Accelerated tier is designed to allow shared data access from within the data center at higher rates of both throughput and operations per second. Because it remains accessible from the Internet as well, it is ideally suited for ingesting data into cluster computing workloads, such as AI training. Depositing data on the Accelerated tier of Object Storage makes it available for import into, for example, Vultr File System, where it can then be read at the very high throughput required to keep GPU clusters flush with data.

Operation Type Object Size Objects/s Throughput (MiB/s)
PUT 4 KB 8,483.7 32.3
PUT 64 KB 8,470.9 517.0
PUT 512 KB 6,749.3 3,295.5
PUT 1024 KB 3,837.1 3,659.3
PUT 10240 KB 400.8 3,822.4
GET 4 KB 6,611.3 25.2
GET 64 KB 8,255.3 503.8
GET 512 KB 6,625.8 3,235.2
GET 1024 KB 3,987.0 3,802.3
GET 10240 KB 438.7 4,184.1

Replicating These Results

You can replicate these benchmarks using the warp utility.

  1. Install the warp utility and any dependencies. It can be found at its official GitHub repository, but your distribution likely has it available as a package. In most distributions the package is simply called warp.

  2. Warp operates by having a Warp Server that coordinates one or more Warp Clients, each of which generates load against one or more S3 servers. This allows synchronized workload generation across multiple client machines. In our testing, we used eight Vultr VPS in the same region as the object storage. To set up clients, on each server you’re using as a client you must run the command:

    console
    $ warp client [listenaddress:port]
    

    Where listenaddress:port is an optional specification of the address and port the Warp Client should listen on for connections from the Warp Server. Be sure that the firewall on each host allows communication to the ports you’ve configured your clients to use.

  3. Once Warp Clients are running you can use the following command to execute the benchmark from the Warp Server. Note that the server is not generating any client activity itself. Rather, it is merely instructing the clients to generate the workload.

    console
    $ warp get \
        --duration=2m \
        --warp-client=$WORKERS \
        --host=$S3HOST \
        --bucket=bucket-name-warp \
        --access-key=$S3ACCESS_KEY \
        --secret-key=$S3SECRET_KEY \
        --tls \
        --obj.size=512k \
        --rps-limit=$RATE
    

    In the above command, get can be replaced by any of several benchmarks you may wish to run from amongst the list of available benchmark workloads. For a full list of available benchmark types, see the Warp documentation included at its GitHub repository. The options and variables above have the following meanings:

    • $WORKERS: A comma-separated list of Warp Clients to coordinate. The presence of the --warp-client option indicates that Warp is acting as a server.
    • $S3HOST: The name of the S3 host to test against.
    • --bucket: The name of the bucket Warp should test against.
    • $S3ACCESS_KEY: Your access key.
    • $S3SECRET_KEY: Your secret key.
    • --obj.size: The size of objects used in testing.
    • $RATE: The per-client request rate, set to limit 429 status returns caused by exceeding the object storage rate limit. It should be equal to or slightly below the published request rate of the storage tier divided by the number of Warp Clients in use.

    Refer to the official Warp documentation for complete details.
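As an illustration of how these variables fit together, the sketch below derives a per-client --rps-limit value and a --warp-client host list. The IP addresses are hypothetical placeholders for the eight client machines, the listen port assumes warp's default of 7761, and the tier limit used is the Accelerated tier's published 10,000 operations per second:

```python
# Sketch: derive the warp invocation variables from the tier limits.
# The addresses below are hypothetical placeholders for your Warp Clients.

def per_client_rate(tier_ops_limit: int, num_clients: int) -> int:
    # Divide the tier's published ops/s limit across clients, rounding down
    # so the aggregate rate stays at or below the limit.
    return tier_ops_limit // num_clients

clients = [f"10.0.0.{i}:7761" for i in range(1, 9)]  # 7761: warp's default port
workers = ",".join(clients)   # value to pass via --warp-client
rate = per_client_rate(10_000, len(clients))  # Accelerated tier: 10,000 ops/s

print(workers)
print(rate)  # 1250
```

With eight clients against the Accelerated tier, each client would be limited to 1,250 requests per second.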

Tuning Tips for Best Performance with Vultr Object Storage

  • Enable high levels of parallelism by making more simultaneous requests. Increase the number of clients making requests or the number of threads or workers per client. For instance, we used eight Vultr VPS as clients for our testing with warp rather than testing with a single s3cmd.
  • Object Storage is inherently high latency, so also keep more requests in flight awaiting replies so that latency does not artificially reduce your throughput. For example, when writing code to connect to your Vultr Object Storage buckets, use your language’s object storage library support for request queues or asynchronous requests so that multiple requests can be active across your connections.
  • Honor 429 status codes by slowing down your own request rate. Remember that requests that go unfulfilled because you have exceeded your operations rate limit cost resources as well. Most client applications support some form of back-off after receiving 429 status, allowing them to avoid blasting the Object Storage with requests that will only result in more 429 status responses.
  • If you are merely using s3cmd, consider moving to s5cmd instead. It has considerably improved performance, particularly for highly parallelizable workloads. Refer to the official s5cmd GitHub repository for further details (specifically the Configuring Concurrency section).
  • If you’re using Loki with Vultr Object Storage, it benefits from three or more ingesters and queriers, which increase parallelism. Rclone offers the --transfers flag to increase parallelism and the --retries-sleep flag to handle rate limiting.
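The back-off behavior described above can be sketched as follows. This is illustrative only: do_request and its (status, body) return shape are hypothetical stand-ins for whatever call your S3 client library makes, and most SDKs (for example, boto3’s retry modes) already implement equivalent behavior that you should prefer.

```python
import random
import time

def with_backoff(do_request, max_retries=5, base_delay=0.5):
    """Call do_request(); on an HTTP 429 status, sleep and retry with
    exponentially increasing delay plus jitter."""
    for attempt in range(max_retries + 1):
        status, body = do_request()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Jitter spreads retries out when many clients back off at once.
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status, body
```

Each retry doubles the wait, so a client that keeps receiving 429 responses quickly slows to a rate the gateway will accept rather than compounding the overload.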
