Storage Performance for Vultr Block Storage

Updated on 25 February, 2026

Comprehensive benchmarking analysis of Vultr Block Storage performance, rate limits, tiers, and fio-based replication methodology.


We've run extensive benchmarking of Vultr Block Storage. For additional context on the performance metrics referenced throughout this document and to better understand how storage performance is measured, benchmarked, and compared, see What Are the Fundamentals of Storage Performance?. Vultr Block Storage is designed for use with Vultr Cloud Compute instances and is offered in two performance tiers: HDD Block and NVMe Block.

HDD Block is designed to be cost effective by accepting relatively low performance. It's available at all Vultr sites. As the name suggests, it is largely composed of hard disk drives, but it also includes some flash storage used to accelerate certain metadata operations and caching. Your data is spread redundantly across a large number of such drives to increase performance, but it is still fundamentally limited by the speed of hard disk drives.

NVMe Block is designed for much higher performance, but as a consequence it costs more. It is available at a large number of Vultr sites, especially those with GPU or high-performance CPU systems. As the name suggests, it is composed of NVMe flash drives and so needs no further acceleration. It too is redundant, but its mode of redundancy is chosen for speed rather than lower cost. Again, your data is spread across a large number of such drives to increase performance, and the speed of NVMe drives makes it significantly faster.

Rate Limits

Both tiers of Vultr Block Storage are rate limited at the hypervisor so that a given subscription cannot exceed certain IOPS and throughput levels for sustained periods. These limits prevent a single VM instance from consuming all of the available network throughput or processing power for its storage workload, which would limit what is available to competing workloads.

Block Storage rate limits also allow short bursts of up to 60 seconds during which up to 150% of the sustained limit can be achieved. After burst capacity is consumed, it is replenished only while the workload requests less than the sustained limits.
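As a quick illustration of what the 150% burst factor means in practice (a sketch using the sustained limits from the table below):

```python
# Burst limits are 150% of the sustained limits, for up to 60 seconds.
# Sustained limits are taken from the rate limit table in this document.
BURST_FACTOR = 1.5

tiers = {
    "HDD Block": {"iops": 500, "throughput_mb_s": 100},
    "NVMe Block": {"iops": 10_000, "throughput_mb_s": 400},
}

for name, limits in tiers.items():
    burst_iops = limits["iops"] * BURST_FACTOR
    burst_tput = limits["throughput_mb_s"] * BURST_FACTOR
    print(f"{name}: burst up to {burst_iops:,.0f} IOPS / {burst_tput:.0f} MB/s for <= 60 s")
```

This works out to 750 IOPS / 150 MB/s for HDD Block and 15,000 IOPS / 600 MB/s for NVMe Block during a burst window.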

Tier         Sustained IOPS Limit   Sustained Throughput Limit
HDD Block    500 IOPS               100 MB/s (95.3 MiB/s)
NVMe Block   10,000 IOPS            400 MB/s (381.4 MiB/s)

It is important to understand how these limits interact with the choice of block size. For instance, 500 IOPS at a 4 KB block size yields only 2 MB/s. Conversely, you need a block size of at least 200 KB to reach the 100 MB/s throughput limit of HDD Block before hitting the 500 IOPS limit: 200 KB × 500 IOPS = 100 MB/s.
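The interaction can be sketched as a small calculation (using the HDD Block limits from the table above):

```python
# Effective throughput is capped by whichever limit is hit first:
# the IOPS limit times the block size, or the raw throughput limit.
IOPS_LIMIT = 500            # HDD Block sustained IOPS limit
THROUGHPUT_LIMIT = 100e6    # HDD Block sustained throughput limit, bytes/s

def effective_throughput(block_size_bytes: int) -> float:
    """Return achievable bytes/s for a given block size under both limits."""
    return min(IOPS_LIMIT * block_size_bytes, THROUGHPUT_LIMIT)

for bs_kb in (4, 64, 200, 512):
    mb_s = effective_throughput(bs_kb * 1000) / 1e6
    print(f"{bs_kb:>4} KB blocks -> {mb_s:.1f} MB/s")
```

Small blocks (4 KB, 64 KB) are IOPS-bound; once the block size reaches 200 KB, the throughput limit takes over and larger blocks gain nothing further.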

Benchmark Results

We used the utility fio to measure performance at several block sizes: 4 KB, 64 KB, 512 KB, 1024 KB, and 4096 KB. We performed tests with 100% read (both random and sequential), 100% write (again both random and sequential), and a mixed workload of 50% reads and 50% writes.

We performed these tests across a wide variety of VM instance plans so as to see any break points where insufficient CPU or memory in the plan could impact storage performance. We did not find such a break point with block performance even on plans with only 1 core and 1 GB of RAM.

NVMe Block

This table shows the performance results for each of the three IO types at each of the tested block sizes. In each case we used a queue depth of 4 and a job count of 4.

IO Type                          Block Size   Mean IOPS   Mean Throughput (MiB/s)   Mean Latency (ms)
randwrite, randread, and randrw  4 KB         ≈10,000     ≈40                       2.7-3.2
randwrite, randread, and randrw  64 KB        ≈6,000      ≈381                      4-5
randwrite, randread, and randrw  512 KB       ≈750        ≈381                      40-50
randwrite, randread, and randrw  1 MB         ≈380        ≈381                      80-100
randwrite, randread, and randrw  4 MB         ≈95         ≈381                      320-420

It is important to note that this table lists mean throughput in MiB/s, whereas the rate limits are stated in MB/s. For example, 381.4 MiB/s equals 400 MB/s, which is the throughput rate limit for NVMe Block storage.
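The unit conversion is easy to get wrong, so here it is spelled out for the NVMe limit:

```python
# MB (10**6 bytes) vs MiB (2**20 bytes) conversion for the NVMe limit.
MB = 10**6
MiB = 2**20

limit_mb_s = 400                      # NVMe Block throughput limit in MB/s
limit_mib_s = limit_mb_s * MB / MiB   # the same limit expressed in MiB/s
print(f"{limit_mb_s} MB/s = {limit_mib_s:.2f} MiB/s")
```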

At the 4 KB block size, individual IOs are so small that throughput is constrained by the IOPS rate limit. At larger block sizes, the throughput rate limit is reached first, which keeps the number of IOPS below the IOPS rate limit.
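The crossover point between the two regimes can be computed directly (a sketch using the NVMe Block limits from the table above):

```python
# The block size at which the IOPS limit and the throughput limit are
# reached simultaneously; below it IOPS binds, above it throughput binds.
IOPS_LIMIT = 10_000        # NVMe Block sustained IOPS limit
THROUGHPUT_LIMIT = 400e6   # NVMe Block sustained throughput limit, bytes/s

crossover_bytes = THROUGHPUT_LIMIT / IOPS_LIMIT
print(f"crossover block size: {crossover_bytes / 1000:.0f} KB")
# Below ~40 KB the IOPS limit binds (as in the 4 KB row of the table);
# above it the throughput limit binds (as in the 64 KB row and larger).
```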

Latency rises as block size increases because the throughput rate limit has been reached. Rate limiting operates by delaying responses until answering would keep throughput below the limit, effectively injecting latency.
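This delaying behavior can be illustrated with a minimal token-bucket sketch. This is conceptual only; it is an assumption that Vultr's limiter works this way, and the logical clock below ignores real service time:

```python
# A logical-clock token bucket: tokens refill at the rate limit, and a
# request that finds too few tokens is delayed until enough accumulate.
class TokenBucket:
    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes
        self.clock = 0.0  # simulated time in seconds

    def request(self, size_bytes: float) -> float:
        """Admit one request, returning the delay injected by the limiter."""
        delay = 0.0
        if self.tokens < size_bytes:
            # Wait exactly long enough for the deficit to refill.
            delay = (size_bytes - self.tokens) / self.rate
            self.clock += delay
            self.tokens = size_bytes
        self.tokens -= size_bytes
        return delay

# 100 MB/s limit; back-to-back 4 MB requests soon see injected latency.
bucket = TokenBucket(rate_bytes_per_s=100e6, capacity_bytes=8e6)
delays = [bucket.request(4e6) for _ in range(4)]
print([f"{d * 1000:.0f} ms" for d in delays])
```

The first requests drain the bucket for free; once it is empty, each 4 MB request must wait 40 ms, which is exactly the latency needed to hold throughput at 100 MB/s.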

HDD Block

For HDD Block, results are similar in that either the IOPS rate limit or the throughput rate limits are hit. The primary difference is that the rate limits for HDD Block are lower.

Overall Conclusion

The overall conclusion is that both HDD Block and NVMe Block can achieve their rate limited speeds of 500 IOPS and 100 MB/s for HDD and 10,000 IOPS and 400 MB/s for NVMe, even on the smallest instance plans.

Replicating These Results

You can replicate these results yourself by running fio at any of the block sizes mentioned, with any operations mix you would like.

  1. Install the fio utility and any dependencies. It can be found on git.kernel.org, but your distribution likely has it available as a package. In most distributions the package is simply called fio. You should also install libaio so that it is available to fio. The package is usually called either libaio-dev or libaio-devel, depending on your distribution.

  2. When running fio, you will need to create a job configuration file that you can reference and then run a command line that points at the job file.

    ```ini
    [FIOJOB]
    filename=/mnt/vbs/fio.raw
    size=500G
    random_generator=lfsr
    buffered=0
    direct=1
    invalidate=0
    ioengine=libaio
    rw=randwrite
    bs=4k
    iodepth=4
    numjobs=4
    runtime=900
    loops=1
    time_based=1
    ```

    Key values to change to match the workload you are testing are:

    • filename= This should be a file on the file system you are testing. If you point this at a block device itself, understand that the test is destructive: it will overwrite any data on the device and destroy any file system on it.
    • direct= 1 enables O_DIRECT, 0 disables it. Use direct=1 with ioengine=libaio.
    • ioengine= We recommend libaio for best results, but you may wish to compare it with sync or psync. We used libaio in our testing.
    • rw= randread, randwrite, and randrw are the most useful options.
    • bs= Block size. We used 4k, 64k, 512k, 1M, and 4M in our testing.
    • iodepth= The queue depth per job. We used 4 in our testing.
    • numjobs= The number of simultaneous jobs to run. We used 4 in our testing.
  3. Then you can reference the job config from the command line:

    ```console
    $ fio \
        --eta=never \
        --status-interval=5000ms \
        --output-format=json+ \
        $FIOJOBFILE
    ```

    Where $FIOJOBFILE is the path to the job file created above. See the fio documentation for more details.
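Once the run completes, the JSON output can be summarized with a short script. This is a sketch: the sample dictionary below stands in for a real results file, and it assumes the `jobs[].read`/`jobs[].write` layout of fio's JSON output, in which bandwidth (`bw`) is reported in KiB/s.

```python
import json

# Minimal stand-in for `fio --output-format=json+` results; a real run
# would instead be loaded with json.load(open("results.json")).
sample = json.loads("""
{
  "jobs": [
    {"jobname": "FIOJOB",
     "read":  {"iops": 9950.2, "bw": 39800},
     "write": {"iops": 0.0,    "bw": 0}}
  ]
}
""")

for job in sample["jobs"]:
    for direction in ("read", "write"):
        stats = job[direction]
        # fio reports bandwidth ("bw") in KiB/s; convert to MiB/s.
        mib_s = stats["bw"] / 1024
        print(f'{job["jobname"]} {direction}: {stats["iops"]:.0f} IOPS, {mib_s:.1f} MiB/s')
```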

Tuning Tips for Best Performance with Vultr Block Storage

  • Enable higher levels of parallelism by making more simultaneous requests. Increase the number of processes, threads, or workers issuing I/O operations. In the fio benchmarking utility, parallelism can be increased by increasing numjobs and iodepth (the number of requests each job allows to be in flight without a response).

  • Larger queue depths allow more requests to remain in flight while waiting for responses, preventing high latency from artificially reducing throughput.

  • In many cases, asynchronous I/O can increase performance. Some applications can leverage libaio via a configurable option. In fio, enable asynchronous I/O with ioengine=libaio.

  • In most cases, caches should be disabled to use Direct I/O. In fio, this is achieved with direct=1 and buffered=0.
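As a rough guide to sizing that parallelism, Little's law relates in-flight requests, per-request latency, and IOPS. The sketch below uses the 10,000 IOPS NVMe Block limit from this document; the latency values are illustrative:

```python
# Little's law: concurrency = IOPS * latency.
# To sustain a target IOPS at a given per-request latency, you need at
# least this many requests in flight (numjobs * iodepth in fio terms).
def required_concurrency(target_iops: float, latency_s: float) -> float:
    return target_iops * latency_s

target = 10_000  # NVMe Block sustained IOPS limit
for latency_ms in (0.5, 1.0, 2.0):
    c = required_concurrency(target, latency_ms / 1000)
    print(f"{latency_ms} ms latency -> {c:.0f} requests in flight")
```

The higher the latency, the more outstanding requests you need to keep the device at its limit, which is why a single synchronous writer rarely reaches the rate limits.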
