Benchmarking Guest on FreeNAS ZFS, bhyve and ESXi

FreeNAS 11 introduces a GUI for FreeBSD’s bhyve hypervisor.  This is a potential replacement for the ESXi + FreeNAS All-in-One “hyper-converged storage” design.

Hardware

The hardware is based on my Supermicro Microserver Build:

  • Xeon D-1518 (4 physical cores, 8 threads) @ 2.2GHz
  • 16GB DDR4 ECC memory
  • 4 x 2TB HGST in RAID-Z, 100GB Intel DC S3700s for ZIL (over-provisioned to 8GB; a rough sketch of one way to do this follows this list) on an M1015.  In Environments 1 and 2 this was passed to FreeNAS via VT-d.
  • 2 x Samsung FIT USBs for booting OS (either ESXi or FreeNAS)
  • 1 x extra DC S3700 used as ESXi storage for the FreeNAS VM to be installed on in environments 1 and 2 (not used in environment 3).
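On over-provisioning the SLOG: FreeNAS normally handles disk setup through its own UI, but a minimal sketch of one way to do it from a FreeBSD shell is to partition only 8GB of the SSD and leave the rest unallocated.  The pool name, device, and label below are placeholders:

gpart create -s gpt da1                       # fresh GPT on the SSD
gpart add -t freebsd-zfs -s 8G -l slog0 da1   # carve out only 8GB, leave the rest unallocated
zpool add tank log gpt/slog0                  # attach the partition as the SLOG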

Environments

E1. ESXi + FreeNAS 11 All-in-one.

Setup per my FreeNAS on VMware Guide.  An Ubuntu VM with the Paravirtual adapter is installed as an ESXi guest on NFS storage backed by ZFS on FreeNAS, which has raw access to the disks and runs under the same ESXi hypervisor, connected via virtual networking.  FreeNAS is given 2 cores and 10GB memory.  The guest gets 1GB memory and was tested with 1C and 2C.
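For those who prefer the ESXi shell to the vSphere client, mounting the FreeNAS NFS export as a datastore looks roughly like this (the IP, dataset path, and datastore name are placeholders):

esxcli storage nfs add --host=192.168.1.10 --share=/mnt/tank/vm-storage --volume-name=freenas-nfs
esxcli storage nfs list   # confirm the datastore is mounted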

E2. Nested bhyve + ESXi + FreeNAS 11 All-in-one.

Nested virtualization test.  An Ubuntu VM with VirtIO is installed as a bhyve guest on FreeNAS, which has raw access to the disks and itself runs under the ESXi hypervisor.  FreeNAS is given 4 cores and 12GB memory.  The guest gets 1GB memory and was tested with 1C and 2C.  What is neat about this environment is that it could be used as a stepping stone when migrating from environment 1 to environment 3 or vice versa (I actually tested this migration with success).
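Note that running bhyve inside the FreeNAS VM requires ESXi to expose the hardware virtualization extensions to that VM.  This can be done with the "Expose hardware assisted virtualization to the guest OS" CPU option in the vSphere client, or equivalently with the following line in the VM's .vmx file:

vhv.enable = "TRUE"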

E3. bhyve + FreeNAS 11

An Ubuntu VM with VirtIO is installed as a bhyve guest on FreeNAS on bare metal.  The guest gets 1GB memory.  The guest was backed with a ZVOL since that was the only option.  Tested with 1C and 2C.
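Under the hood the FreeNAS GUI essentially creates a ZVOL and attaches it to the guest as a VirtIO block device.  A rough hand-rolled equivalent on plain FreeBSD looks something like the sketch below; the pool, ZVOL, tap device, and guest name are placeholders, and the exact flags FreeNAS generates may differ:

zfs create -V 20G tank/ubuntu0-disk0   # ZVOL to back the guest's disk
bhyve -c 2 -m 1G -H -A \
  -s 0,hostbridge \
  -s 3,virtio-blk,/dev/zvol/tank/ubuntu0-disk0 \
  -s 4,virtio-net,tap0 \
  -s 31,lpc -l com1,stdio \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  ubuntu0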

All environments used FreeNAS 11; E1 and E2 used VMware ESXi 6.5.

Testing Notes

A reboot of the guest and FreeNAS was performed between each test to clear ZFS’s ARC (in-memory read cache).  The sysbench test files were recreated at the start of each test.  The script I used for testing is https://github.com/ahnooie/meta-vps-bench with the networking tests removed.
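For reference, the ARC state can be confirmed from the FreeNAS shell after a reboot, and the sysbench file set is torn down and rebuilt with cleanup/prepare inside the guest.  A minimal sketch follows; the 8G size is illustrative, the actual value is set in the script:

sysctl kstat.zfs.misc.arcstats.size            # ARC size in bytes; near zero right after a reboot
sysbench --test=fileio --file-total-size=8G cleanup
sysbench --test=fileio --file-total-size=8G prepare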

No attempts at tuning were made in any environment.  I just used the sensible defaults.

Disclaimer on comparing Apples to Oranges

This is not a business or enterprise level comparison.  This test is meant to show how an Ubuntu guest performs in various configurations on the same hardware, within the constraints of a typical budget home server running a free “hyperconverged” solution: a hypervisor and FreeNAS storage on the same physical box.  Not all environments are meant to perform identically; my goal is just to see whether each environment performs “good enough” for home use.  An obvious example: environments using NFS-backed storage are going to be slower than environments with local storage, but they should still at the very least max out a 1Gbps Ethernet link.  This set of tests benchmarks how I would set up each environment given the constraint of one physical box running both the hypervisor and FreeNAS + ZFS as the storage backend.  The test is limited to a single guest VM.  In the real world dozens, if not hundreds or even thousands of VMs run simultaneously, so advanced hypervisor features like memory deduplication make a big difference; this test made no attempt to benchmark that.  This is not an apples-to-apples test, so be careful what conclusions you draw from it.

CPU 1 and 2 threaded test

I’d say these are equivalent, which probably shows how little overhead there is from the hypervisor these days, though nested virtualization is a bit slower.

CPU 1 and 2 threaded

CPU 4 threaded test

Good to see that 2 cores actually perform faster than 1 core on a 4-threaded test.  Nothing to see here…

CPU 4 threads

Memory Operations Per Second

Horrible performance with nested virtualization, but with the hypervisor on bare metal, ESXi and bhyve performed identically.

Memory OPS

Memory MB/s

Once again nested virtualization was slow; other than that it was neck-and-neck.

Memory Test

OLTP Transactions Per Second

The ESXi environment clearly takes the lead over bhyve, especially as the number of cores/threads increases.  This is interesting because ESXi outperforms despite an I/O penalty from using NFS, so it must be more than making up for that somewhere else.

OLTP Test

Disk I/O Requests per Second

Clearly there’s an advantage to using local ZFS storage vs NFS.  I’m a bit disappointed in the nested virtualization performance, since from a storage standpoint it should be equivalent to bare-metal FreeNAS; this may be due to the slow memory performance in that environment.

Disk Random I/O

Disk Sequential Read/Write MBps

No surprises here: local ZFS storage is going to outperform NFS.

Disk Sequential I/O

Well there you have it.  I think it’s safe to say that bhyve is a viable solution for home use (although I would like to see more people using it in the wild before considering it robust; I imagine we’ll see more of that now that FreeNAS has a UI for it).  For low-resource VMs, E2 (nested virtualization) is a way to migrate between E1 and E3, but it’s not going to work for high-performance VMs because of the memory performance hit.

$5 DigitalOcean, Vultr & Lightsail Benchmarks

Amazon Lightsail has entered the VPS market, competing directly with DigitalOcean and Vultr.  I for one welcome more competition in the $5 cloud server space.  I wanted to see how they perform so I spun up 24 cloud servers, 8 for each provider and ran some benchmarks.

$5 Cloud Server Providers Compared

DigitalOcean, Vultr, and Amazon Lightsail offer more expensive plans, but this post is dealing with the low-end $5 plans.  Here’s how they compare:

DigitalOcean

  • 1 CPU Core
  • 512MB Memory
  • 20GB HDD (extra block storage @ $0.10/GB/month)
  • 1TB Bandwidth ($0.02/GB overage fee in U.S.).
  • Free DNS
  • Best team management – DigitalOcean lets you create multiple teams and add or remove users from those teams.
  • 99.99% SLA
  • Floating IPs
  • Ubuntu, FreeBSD, Fedora, Debian, CoreOS, CentOS

Vultr

  • 1 CPU Core
  • 768MB Memory
  • 15GB HDD (extra block storage @ $0.10/GB/month)
  • 1TB Bandwidth ($0.02/GB overage fee in U.S.)
  • Free DNS
  • Account sharing – allows you to set up multi-user access.
  • 100% SLA
  • Floating IPs (currently can’t be set up automatically; requires a request to support)
  • Ubuntu, FreeBSD, Fedora, Debian, CoreOS, CentOS, Windows, or any OS with your Custom ISO.

Amazon Lightsail

  • 1 CPU Core
  • 512MB Memory
  • 20GB HDD (block storage not available)
  • 1TB Bandwidth ($0.09/GB overage fee in U.S.)
  • 3 Free DNS zones (redundancy across TLDs as well).
  • 99.95% SLA
  • Amazon Linux or Ubuntu

Geographic Locations

All three providers have multiple geographic locations worldwide.  Vultr has the most locations in the United States, while Amazon has more geographic locations in the world (although only Virginia is available to Lightsail at this point in time).

DigitalOcean Locations

DigitalOcean World Map showing DC locations in Toronto, San Francisco, New York City, London, Amsterdam, Frankfurt, Bangalore, and Singapore

Vultr Global Locations

Vultr map showing DC locations in Seattle, Silicon Valley, Los Angeles, Dallas, Miami, Atlanta, New Jersey, Chicago, London, Frankfurt, Amsterdam, Paris, Singapore, Tokyo, Sydney

Amazon Lightsail Global Infrastructure

AWS world map showing DC regions in Virginia, Oregon, California, Ireland, Frankfurt, Singapore, Tokyo, Seoul, Sydney, Sao Paulo, and China

API Automation

All providers offer an API.  In practice DigitalOcean has been around the longest and thus is more likely to be supported in automation tools (such as Ansible).  I expect support for the other APIs to catch up soon.
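As a rough illustration of what that automation looks like, spinning up a $5 droplet through DigitalOcean’s v2 API is a single call.  The token, droplet name, and slugs below are placeholders, and slugs change over time:

curl -s -X POST "https://api.digitalocean.com/v2/droplets" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{"name":"bench-01","region":"nyc3","size":"512mb","image":"ubuntu-16-04-x64"}'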

And finally…

Benchmarks

CPU Test – Calculating Primes

Number of seconds needed to compute prime numbers.  On the CPU test Amazon Lightsail consistently outperformed the other two, with Vultr coming in second and DigitalOcean last.  CPU1 and CPU2 are 1 and 2 threads respectively calculating primes up to 10,000.  CPU4 is a 4-threaded test calculating primes up to 100,000.
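The underlying sysbench invocations are along these lines (shown in the older sysbench 0.4.x syntax; the exact flags live in the repo’s script):

sysbench --test=cpu --cpu-max-prime=10000 --num-threads=1 run    # CPU1
sysbench --test=cpu --cpu-max-prime=10000 --num-threads=2 run    # CPU2
sysbench --test=cpu --cpu-max-prime=100000 --num-threads=4 run   # CPU4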

Lower is better.

Memory

Lower is better.

(I accidentally omitted the memory test from my parser script and didn’t realize it until the last test ran, so this is the average of 4 results per provider)
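The memory test is a plain sysbench memory run, roughly like this; the transfer size shown is illustrative rather than the script’s exact value:

sysbench --test=memory --memory-total-size=2G --num-threads=1 run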

OLTP (Online transaction processing)

Higher is better.

The OLTP load test simulates a transactional database; in general it measures latency on random inserts, updates, and reads against a MariaDB database.  CPU, memory, and storage latency can all affect performance, so it’s a good all-around indicator.  This test measures the number of transactions per second.  In this area Vultr outperformed DigitalOcean and Amazon Lightsail in the 2- and 4-thread tests, while Lightsail took the lead in the 8-thread test.  I don’t know why Lightsail started to perform better under multi-threaded tests; my guess is that while Lightsail doesn’t offer the fastest single-threaded storage IOPS, it may have better multi-threaded IOPS, but I can’t say for sure without doing some different kinds of tests.  DigitalOcean performed the worst in all tests, probably due to its slower CPU and memory speed.
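For reference, a sysbench OLTP run against MariaDB looks roughly like this; table size, duration, and credentials are placeholders, and the exact values are in the script:

sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=sbtest --mysql-user=root prepare
sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=sbtest --mysql-user=root \
  --num-threads=4 --max-time=60 --max-requests=0 run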

Random IOPS

Higher is better.

Requests per second.  In random IOPS Vultr provided the best consistent performance; DigitalOcean comes in second place with wide variance, and Lightsail comes in last but was by far the most predictable.
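The random I/O numbers come from sysbench’s fileio test in combined random read/write mode; a rough sketch, with the file set size and duration being illustrative:

sysbench --test=fileio --file-total-size=2G prepare
sysbench --test=fileio --file-total-size=2G --file-test-mode=rndrw --max-time=60 --max-requests=0 run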

Sequential Reads / Writes / Re-writes

Higher is better.

This simply measures sequential read/write speeds on the disk.  Vultr offers the most consistent high performance; DigitalOcean is all over the place but generally better than Lightsail, which comes in last.
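Sequential throughput uses the same sysbench fileio test with its sequential modes; roughly:

sysbench --test=fileio --file-total-size=2G --file-test-mode=seqrd run     # sequential reads
sysbench --test=fileio --file-total-size=2G --file-test-mode=seqwr run     # sequential writes
sysbench --test=fileio --file-total-size=2G --file-test-mode=seqrewr run   # sequential re-writes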

Latency (Ping ms) U.S. Locations

Lower is better.

The U.S. latency is all close enough that it doesn’t matter.

Latency (Ping ms) Worldwide Locations

Lower is better.

International latency: again, the results are pretty close.

Download Speed Tests from U.S. Locations

Higher is better.

Downloading data from various locations.  It’s really hard to draw any meaningful conclusions from this… the faster speeds to New York probably have to do with DigitalOcean and Vultr being located in the New York area vs. Lightsail’s location in Virginia.
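These numbers come from the speedtest portion of the script, which is essentially speedtest-cli pointed at specific servers.  A minimal sketch; the server ID below is a placeholder, the actual list is in the repo:

speedtest-cli --list | grep "New York"    # find candidate server IDs
speedtest-cli --server 10562 --simple     # test against one specific server (ID is illustrative)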

 

Upload Speed Tests to U.S. Locations

Higher is better.

Due to the similarities in the test results I think the bandwidth constraints are on the other side, or at peering.

Download Speed Tests from Worldwide Locations

Higher is better.

Who knows what one could conclude from this; it seems the various providers have different-quality peering to different worldwide locations, but there are so many variables it’s hard to say.

 

Upload Speed Tests to Worldwide Locations

Higher is better.

Similar groupings for the most part.

Testing Methodology

I spun up 24 x $5 servers, 8 for each VPS provider.  I spun up 12 servers yesterday and ran tests, destroyed the VMs, then created 12 new servers today and repeated the tests.  All tests were run in the Eastern United States.   I chose that region because the only location available currently in Amazon Lightsail is Virginia, so to get as close as I could I deployed Vultr and DigitalOcean servers out of their New York (and New Jersey) data centers.  New York is a great place to put a server if you’re trying to provide low latency to the major populations in the United States and Europe without using a CDN.

If the provider had multiple data centers in a region I tried to spread them out.

  • DigitalOcean – I deployed 4 servers in NYC3, 2 in NYC2, and 2 in NYC1.
  • Vultr – All 8 servers deployed in their New Jersey data center.
  • Amazon Lightsail – Deployed in their Virginia region, 2 in each of their four AWS Availability Zones.

All the tests I ran are relatively short in duration; I did not benchmark sustained loads, which may produce different results.  My general use case is a web server or small build server with intermittent workloads.  I often spin up servers for a few hours or days and then destroy them once they’re done with their tasks.

Testing Scripts

The testing scripts I used are available in my GitHub meta-vps-bench repository.  They are very rudimentary and could be improved; they run sysbench and speedtest benchmarks.  The following commands were run on each server as root:

apt update
apt upgrade
reboot
git clone https://github.com/ahnooie/meta-vps-bench.git
cd meta-vps-bench
./setup.sh          # install benchmark dependencies
./bench.sh          # run the sysbench and speedtest benchmarks
./parse.sh          # parse the raw output into result files
cat speed*.result   # display the parsed results

I tried to stagger starting the tests so that multiple speedtests against the same location had a low risk of occurring at the same time… but it may not always work out that way.  I ran all tests twice per server which gives 48 total results (16 for each provider).

This script is for testing.  I do NOT recommend running this on production servers.

Data

I have published my results to Tableau Public.

Profit Center

This was a lot of work, give Ben some money!

Sign up for DigitalOcean using this link and I’ll get $25 service credit!

Sign up for Vultr using this link and I’ll get $30 credit!

Sign up for Amazon Lightsail and I get nothing.  Oh well.

Well, that’s it.