ZFS Dataset Hierarchy | Data Hoarder Edition

ZFS is flexible and will let you name and organize datasets however you choose–but before you start building datasets there are some ways to make management easier in the long term.  I've found the following convention works well for me.  It's not "the" way by any means, but I hope you'll find it helpful; I wish tips like this had been written down when I built my first storage system 4 years ago.

Here are my personal ZFS best practices and naming conventions to structure and manage ZFS data sets.

ZFS Pool Naming

I never give two zpools the same name, even if they're on different servers, on the off-chance that sometime down the road I'll need to import two pools into the same system.  I generally name my zpools tank[n], where n is an incremental number that's unique across all my servers.

So if I have two servers, say stor1 and stor2 I might have two zpools :

stor1.b3n.org: tank1
stor2.b3n.org: tank2

Top Level ZFS Datasets for Simple Recursive Management

Create a top level dataset called ds[n], where n is a unique number across all your pools, just in case you ever have to bring two separate datasets onto the same zpool.  The reason I like to create one main top-level dataset is that it makes it easy to manage high-level tasks recursively on all sub-datasets (such as snapshots, replication, backups, etc.).  If you have more than a handful of datasets you really don't want to be configuring replication on every single one individually.  So on my first server I have:

tank1/ds1

I usually mount tank1/ds1 as read-only from my CrashPlan VM for backups.  You can configure snapshot tasks, replication tasks, and backups all at this top level and be done with it.
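If you ever manage this from the command line instead of the FreeNAS GUI, the recursive equivalent looks roughly like this (dataset names from above; the snapshot label is just an example):

  # snapshot every sub-dataset under the top-level dataset in one shot
  zfs snapshot -r tank1/ds1@daily-2016-01-01

  # prune that snapshot everywhere once it ages out
  zfs destroy -r tank1/ds1@daily-2016-01-01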

freenas_snapshot_pruning
ZFS snaps and pruning recursively managed at the top level dataset

Name ZFS Datasets for Replication

One of the reasons to have a top level dataset is if you’ll ever have two servers…

stor1.b3n.org
   | - tank1/ds1

stor2.b3n.org
   | - tank2/ds2

I replicate them to each other for backup.  Having that top level ds[n] dataset lets me manage ds1 (the primary dataset on the server) completely separately from the replicated dataset (ds2) on stor1.

stor1.b3n.org
 | - tank1/ds1
 | - tank2/ds2 (replicated)

stor2.b3n.org
 | - tank2/ds2
 | - tank1/ds1 (replicated)
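Under the hood that cross-replication boils down to a recursive send of the top-level dataset to the other box, roughly like the sketch below (hostnames from above; where the replica lands on the target and the use of root ssh are assumptions, and later runs would use incremental sends with -i/-I):

  # on stor1: seed the replica of the whole ds1 tree onto stor2
  zfs snapshot -r tank1/ds1@replica-1
  zfs send -R tank1/ds1@replica-1 | ssh stor2.b3n.org zfs receive -duF tank2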

Advice for Data Hoarders.  Overkill for the Rest of Us

supermicro_zfs

The ideal is to back up everything.  But in reality storage costs money, and WAN bandwidth isn't always available to back up everything remotely.  I like to structure my datasets such that I can manage them by importance, so under the ds[n] dataset create sub-datasets.

stor1.b3n.org
 | - tank1/ds1/kirk – very important – family pictures, personal files
 | - tank1/ds1/spock – important – ripped media, ISO files, etc.
 | - tank1/ds1/redshirt – scratch data, tmp data, testing area
 | - tank1/ds1/archive – archived data
 | - tank1/ds1/backups – backups

Kirk – Very Important.  Family photos, home videos, journal, code, projects, scans, crypto-currency wallets, etc.  I like to keep four to five copies of this data using multiple backup methods and multiple locations.  It's backed up to CrashPlan offsite, rsynced to a friend's remote server, snapshots are replicated to a local ZFS server, plus an annual backup to a local hard drive for cold storage.  That's 3 copies onsite, 2 copies offsite, 2 different file-system types (ZFS, XFS) and 3 different backup technologies (CrashPlan, rsync, and ZFS replication).  I do not want to lose this data.

Multiple Backup Locations Across the World
Important data is backed up to multiple geographic locations

Spock – Important.  Important data that would be a pain to lose, might cost money to reproduce, but it isn’t catastrophic.  If I had to go a few weeks without it I’d be fine.  For example, rips of all my movies, downloaded Linux ISO files, Logos library and index, etc.  If I lost this data and the house burned down I might have to repurchase my movies and spend a few weeks ripping them again, but I can reproduce the data.  For this dataset I want at least 2 copies, everything is backed up offsite to CrashPlan and if I have the space local ZFS snapshots are replicated to a 2nd server giving me 3 copies.

redshirt_startrek

Redshirt – This is my expendable dataset.  This might be a staging area to store MakeMKV rips until they're transcoded, or where I do video editing or test out VMs.  This data doesn't get backed up… I may run snapshots with a short retention policy.  Losing this data would mean losing no more than a day's worth of work.  I might also set sync=disabled to get maximum performance here.  And typically I don't do ZFS snapshot replication to a 2nd server.  In many cases it will make sense to pull this out from under the top level ds[n] dataset and have it be by itself.

Backups – This dataset contains backups of workstations, servers, and cloud services.  I may back up the backups to CrashPlan or some online service, and usually that is sufficient since I already have multiple copies elsewhere.

Archive – This is data I no longer use regularly but don't want to lose: old school papers that I'll probably never need again, backup images of old computers, etc.  I set this dataset to compression=gzip-9, back it up to CrashPlan plus a local backup, and try to have at least 3 copies.
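From the CLI the per-dataset tuning above would look roughly like this (dataset names from this post; treat the property values as examples):

  # archive: trade CPU for space on cold data
  zfs create -o compression=gzip-9 tank1/ds1/archive

  # redshirt: expendable scratch space, favor speed over safety
  zfs create -o sync=disabled tank1/ds1/redshirt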

Now, you don't have to name the datasets Kirk, Spock, and Redshirt… but the idea is to identify importance so that you're only managing a few datasets when configuring ZFS snapshots, replication, etc.  If you have unlimited cheap storage and bandwidth it may not be worth doing this–but it's nice to have the option to prioritize.

Now… once I’ve established that hierarchy I start defining my datasets that actually store data which may look something like this:

stor1.b3n.org
| - tank1/ds1/kirk/photos
| - tank1/ds1/kirk/git
| - tank1/ds1/kirk/documents
| - tank1/ds1/kirk/vmware-kirk-nfs
| - tank1/ds1/spock/media
| - tank1/ds1/spock/vmware-spock-nfs
| - tank1/ds1/spock/vmware-iso
| - tank1/ds1/redshirt/raw-rips
| - tank1/ds1/redshirt/tmp
| - tank1/ds1/archive
| - tank1/ds1/archive/2000
| - tank1/ds1/archive/2001
| - tank1/ds1/archive/2002
| - tank1/ds1/backups
| - tank1/ds1/backups/incoming-rsync-backups
| - tank1/ds1/backups/windows
| - tank1/ds1/backups/windows-file-history

 

With this ZFS hierarchy I can manage everything at the top level of ds1 and just setup the same automatic snapshot, replication, and backups for everything.  Or if I need to be more precise I have the ability to handle Kirk, Spock, and Redshirt differently.

 

FreeNAS Mini XL, 8 bay Mini-ITX NAS

Catching up on email, I saw a Newsletter from iX Systems announcing the FreeNAS Mini XL (the irony).  On the new FreeNAS Mini page it looks just like the FreeNAS mini but taller to accommodate 8-bays.

Available on Amazon starting at $1,500 with no drives.

Here’s the Quick Start Guide and Data Sheet.

The pictures show what appears to be the ASRock C2750D4I motherboard, which has an 8-core Atom (Avoton) processor.  With the upcoming FreeNAS 9.10 (based on FreeBSD 10) it should be able to run the bhyve hypervisor as well (at least from the CLI; a bhyve GUI might have to wait until FreeNAS 10), making for a nice all-in-one hypervisor with ZFS without the need for VT-d.  This may end up being a great successor to the HP Microserver for those wanting to upgrade to a little more capacity.

The case is the Ablecom CS-T80 so I imagine we’ll start seeing it from Supermicro soon as well.  According to Ablecom it has 8 hotswap bays plus 2 x 2.5″ internal bays and still managed to have room for a slim DVD/Blu-Ray drive.

ablecom_cs_t80
It's really great to see an 8-bay Mini-ITX NAS case that's nicer than the existing options out there.  I hope the FreeNAS Mini XL will have an option for a more powerful motherboard even if it means having to use up the PCI-E slot with an HBA–I'm not really a fan of the Marvell SATA controllers on that board, and of course a Xeon-D would be nice.

 

 

VMware vs bhyve Performance Comparison

Playing with bhyve

Here’s a look at Gea’s popular All-in-one design which allows VMware to run on top of ZFS on a single box using a virtual 10Gbe storage network.  The design requires an HBA, and a CPU that supports VT-d so that the storage can be passed directly to a guest VM running a ZFS server (such as OmniOS or FreeNAS).  Then a virtual storage network is used to share the storage back to VMware.

vmware_all_in_one_with_storage_network
VMware and ZFS: All-In-One Design

bhyve can simplify this design: since it runs under FreeBSD, the host is already a ZFS server.  This not only simplifies the design, but it could potentially allow a hypervisor to run on simpler, less expensive hardware.  The same design in bhyve eliminates the need for a dedicated HBA and a CPU that supports VT-d.

freebsd_bhyve
Simpler bhyve design

I've never understood the advantage of Type-1 hypervisors (such as VMware and Xen) over Type-2 hypervisors (like KVM and bhyve).  Type-1 proponents say the hypervisor runs on bare metal instead of an OS… I'm not sure how VMware isn't considered an OS, except that it is a purpose-built OS and probably smaller.  It seems you could take a Linux distribution running KVM and take away features until at some point it becomes a Type-1 hypervisor.  Which is all fine, but it could actually be a disadvantage if you wanted some of those features (like ZFS).  A Type-2 hypervisor that supports ZFS appears to have a clear advantage (at least theoretically) over a Type-1 for this type of setup.

In fact, FreeBSD may be the best virtualization / storage platform.  You get ZFS and bhyve, and also jails.  You really only need to run bhyve when virtualizing a different OS.

bhyve is still pretty young, but I thought I’d run some tests to see where it’s at…

Environments

This is running on my X10SDV-F Datacenter in a Box Build.

In all environments the following parameters were used:

  • Supermicro X10SDV-F
  • Xeon D-1540
  • 32GB ECC DDR4 memory
  • IBM ServerRaid M1015 flashed to IT mode.
  • 4 x HGST Ultrastar 7K3000 2TB enterprise drives in RAID-Z
  • One DC S3700 100GB over-provisioned to 8GB used as the log device.
  • No L2ARC.
  • Compression = LZ4
  • Sync = standard (unless specified).
  • Guest (where tests are run): Ubuntu 14.04 LTS, 4 cores, 1GB memory, 16GB virtual disk.
  • OS defaults are left as is, I didn’t try to tweak number of NFS servers, sd.conf, etc.
  • My tests fit inside of ARC.  I ran each test 5 times on each platform to warm up the ARC.  The results are the average of the next 5 test runs.
  • I only tested an Ubuntu guest because it's the only distribution I run (in quantity anyway) in addition to FreeBSD; I suppose a more thorough test should include other operating systems.

The environments were setup as follows:

1 – VM under ESXi 6 using NFS storage from FreeNAS 9.3 VM via VT-d

  • FreeNAS 9.3 installed under ESXi.
  • FreeNAS is given 24GB memory.
  • HBA is passed to it via VT-d.
  • Storage shared to VMware via NFSv3, virtual storage network on VMXNET3.
  • Ubuntu guest given VMware para-virtual drivers

2 – VM under ESXi 6 using NFS storage from OmniOS VM via VT-d

  • OmniOS r151014 LTS installed under ESXi.
  • OmniOS is given 24GB memory.
  • HBA is passed to it via VT-d.
  • Storage shared to VMware via NFSv3, virtual storage network on VMXNET3.
  • Ubuntu guest given VMware para-virtual drivers

3 – VM under FreeBSD bhyve

  • bhyve running on FreeBSD 10.1-Release
  • Guest storage is file image on ZFS dataset.

4 – VM under FreeBSD bhyve sync always

  • bhyve running on FreeBSD 10.1-Release
  • Guest storage is file image on ZFS dataset.
  • Sync=always

Benchmark Results

MariaDB OLTP Load

This test is a mix of CPU and storage I/O.  bhyve (yellow) pulls ahead in the 2 threaded test, probably because it doesn’t have to issue a sync after each write.  However, it falls behind on the 4 threaded test even with that advantage, probably because it isn’t as efficient at handling CPU processing as VMware (see next chart on finding primes).
sysbench_oltp
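For reference, a legacy-syntax sysbench OLTP run against MariaDB looks roughly like the following; the table size, thread count, and credentials are placeholders rather than the exact values used for these charts:

  # build the test table once
  sysbench --test=oltp --mysql-user=sbtest --mysql-password=sbtest --oltp-table-size=1000000 prepare

  # run the mixed read/write OLTP workload with 2 (or 4) threads
  sysbench --test=oltp --mysql-user=sbtest --mysql-password=sbtest --oltp-table-size=1000000 --num-threads=2 --max-requests=10000 run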

Finding Primes

Finding prime numbers with a VM under VMware is significantly faster than under bhyve.

sysbench_primes

Random Read

bhyve has an advantage, probably because it has direct access to ZFS.

sysbench_rndrd
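The storage numbers come from sysbench's fileio mode; a typical invocation looks something like this, with the file size as a placeholder and the test mode swapped between rndrd, rndwr, rndrw, seqrd, seqwr, and seqrewr for the charts that follow:

  sysbench --test=fileio --file-total-size=8G prepare
  sysbench --test=fileio --file-total-size=8G --file-test-mode=rndrd run
  sysbench --test=fileio --file-total-size=8G cleanup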

Random Write

With sync=standard bhyve has a clear advantage.  I'm not sure why VMware can outperform bhyve with sync=always.  I am merely speculating, but I wonder if VMware over NFS is coalescing smaller writes into larger blocks (maybe 64k or 128k) before sending them to the NFS server.

sysbench_rndwr

Random Read/Write

sysbench_rndrw

Sequential Read

Sequential reads are faster with bhyve’s direct storage access.

sysbench_seqrd

Sequential Write

What not having to sync every write will gain you…

sysbench_seqwr

Sequential Rewrite

sysbench_seqrewr

 

Summary

VMware is a very fine virtualization platform that's been well tuned.  All the overhead of VT-d, virtual 10GbE switches for the storage network, VM storage over NFS, etc. is not hurting its performance, except perhaps on sequential reads.

For as young as bhyve is, I'm happy with its performance compared to VMware; it does appear to be slower on the CPU-intensive tests.  I didn't intend to compare CPU performance, so I haven't done a wide enough variety of tests to see what the difference is there, but it appears VMware has an advantage.

One thing that is not clear to me is how safe running sync=standard is on bhyve.  The ideal scenario would be honoring fsync requests from the guest; however, I'm not sure if bhyve has that kind of insight into the guest.  Probably the worst case under this scenario with sync=standard is losing the last 5 seconds of writes–but even that risk can be mitigated with a battery backup.  With standard sync there's a lot of performance to be gained over VMware with NFS.  Even if you run bhyve with sync=always it does not perform badly, and it even outperforms the VMware all-in-one design on some tests.
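Switching between the two modes tested above is just a ZFS property on whatever dataset holds the guest's disk image (the dataset name here is an example):

  zfs set sync=always tank/vm/ubuntu     # every write is committed to the SLOG/ZIL
  zfs set sync=standard tank/vm/ubuntu   # only honor explicit sync requests from the guest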

The upcoming FreeNAS 10 may be an interesting hypervisor + storage platform, especially if it provides a GUI to manage bhyve.

 

Supermicro X10SDV-F Build; Datacenter in a Box

I don’t have room for a couple of rackmount servers anymore so I was thinking of ways to reduce the footprint and noise from my servers.  I’ve been very happy with Supermicro hardware so here’s my Supermicro Mini-ITX Datacenter in a box build.

Supermicro Microtower

Supermicro X10SDV Motherboard

Unlike most processors, the Xeon D is an SoC (System on Chip), meaning it's built into the motherboard.  Depending on your compute needs, you've got a lot of pricing / power flexibility with the Mini-ITX Supermicro X10SDV motherboards and their Xeon D SoC CPUs, ranging from a budget build with 2 cores to a ridiculous 16 cores rivaling high-end Xeon E5 class processors!

How many cores do you want?  CPU/Motherboard Options

x10sdv-4c-tln2f_spec Supermicro board with fan

A few things to keep in mind when choosing a board.  Some come with a FAN (normally indicated by a + after the core count), some don’t.  I suggest getting it with a fan unless you’re putting some serious air flow (such as with a 1U server) through the heatsink.  I got one without a fan and had to do a Noctua mod (below).

Many versions of this board are rated for a 7-year lifespan, which means they have components designed to last longer than most boards!  Usually computers go obsolete before they die anyway, but it's nice to have that option if you're looking for a permanent solution.  A VMware / NAS server that'll last you 7 years isn't bad at all!

On the last 5 digits you'll see two options, "-TLN2F" and "-TLN4F"; this refers to the number of Ethernet ports (N2 comes with 2 x gigabit ports, and N4 usually comes with 2 gigabit plus 2 x 10 gigabit ports).  10GbE ports may come in handy for storage, and having 4 ports may be useful if you're going to run a router VM such as pfSense.

I bought the first model, known simply as the "X10SDV-F", which comes with 8 cores and 2 gigabit network ports.  This board looks like it's designed for high-density computing–it's like cramming dual Xeon E5's into a Mini-ITX board.  The Xeon D-1540 will handily outperform the Xeon E3-1230v3 in most tests, can handle up to 128GB memory, and the board provides two NICs (there's also a model that offers two more 10GbE ports for four NICs total), IPMI, 6 SATA-3 ports, a PCI-E slot, and an M.2 slot.

Supermicro X10SDV-F Motherboard
Supermicro X10SDV-F

IPMI / KVM Over-IP / Out of Band Management

One of the great features of these motherboards is that you will never need to plug in a keyboard, mouse, or monitor.  In addition to the 2 or 4 normal Ethernet ports, there is one port off to the side: the management port.  Unlike HP iLO, this is a free feature on Supermicro motherboards.  The IPMI interface will get a DHCP address.  You can download the free IPMIView software from Supermicro, or use the Android app to scan your network for the IP address.  Log in as ADMIN / ADMIN (and be sure to change the password).

Supermicro IPMI KVM over IP

You can reset or power off the server, and even if the power is off you can power it on remotely.
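Standard IPMI tooling works against these boards as well; as a quick sketch (the IP address is a placeholder, and ADMIN/ADMIN are the defaults mentioned above):

  ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN chassis power status
  ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P ADMIN chassis power on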

 

Supermicro KVM

And of course you also get KVM over IP, which is so low level you can get into the BIOS and even load an ISO file from your workstation to boot off of over the network!

When I first saw IPMI I made sure all my new servers would have it.  I hate messing around with keyboards, mice, and monitors, and I don't have room for a hardware-based KVM solution.  This out-of-band management port is the perfect answer, and the best part is the ability to manage your server remotely–I have used this to power on servers and load ISO files onto machines in California from Idaho.

I should note that I would not expose the IPMI port to the internet; make sure it's behind a firewall and accessible only through VPN.

Cooling issue | heatsink not enough

The first boot was fine, but it crashed after about 5 minutes while I was in the BIOS setup…. after a few resets I couldn't even get it to POST.  I finally realized the CPU was getting too hot.  Supermicro probably meant for this model to be in a 1U case with good airflow.  The X10SDV-TLN4F costs a little extra but it comes with a CPU fan in addition to the 10GbE network adapters, so keep that in mind if you're trying to decide between the two boards.

Noctua to the Rescue

I couldn’t find a CPU fan designed to fit this particular socket, so I bought a 60MM Noctua.

60MM Noctua FAN on X10SDV-F
60MM Noctua Fan

This is my first Noctua fan and I think it’s the nicest fan I’ve ever owned.  It came packaged with screws, rubber leg things, an extension cord, a molex power adapter, and two noise reducer cables that slow the fan down a bit.  I actually can’t even hear the fan running at normal speed.

Noctua Fan on Xeon D-1540 X10SDV-F
Noctua Fan on Xeon D-1540

There's not really a good way to screw the fan and the heatsink into the motherboard together, but I took the four rubber things and sort of tucked them under the heatsink screws.  This is a surprisingly secure fit–it's not ideal, but the fan is not going to go anywhere.

Supermicro CSE-721TQ-250B

This is what you would expect from Supermicro: a quality server-grade case.  It comes with a 250 watt 80 Plus power supply and four 3.5″ hotswap bays; the trays are the same as you would find on a 16-bay enterprise chassis.  It also comes with labels numbered from 0 to 4 so you can choose to label the bays starting at 0 (the right way) or 1.  It is designed to fit two fixed 2.5″ drives: one on the side of the HDD cage, and the other can be used on top instead of an optical drive.

The case is roomy enough to work in; I had no trouble adding an IBM ServerRAID M1015 / LSI 9220-8i.

CS721

 

I took this shot just to note that if you could figure out a way to secure an extra drive, there is room to fit three drives, or perhaps two drives even with an optical drive–you'd have to use a Y-splitter to power it.  I should also note that you could use the M.2 slot to add another SSD.

supermicro_x10sdv-f_sc721_opened

The case is pretty quiet, I cannot hear it at all with my other computers running in the same room so I’m not sure how much noise it makes.

This case reminds me of the HP Microserver Gen8 and is probably about the same size and quality, but I think it's a little roomier, and with Supermicro the IPMI is free.

Compared to the Silverstone DS380, the Supermicro CS721 is more compact.  The DS380 has the advantage of being able to hold more drives: it can fit 8 3.5″ or 2.5″ drives in hotswap bays plus an additional four 2.5″ fixed drives in a cage.  Between the two cases I much prefer the Supermicro CS-721 even with less drive capacity.  The DS380 has vibration issues with all the drives populated, and it's also not as easy to work with.  The CS-721 looks and feels much higher quality.

 

Storage Capacity

cs721_open_door
I loaded mine with two Intel DC S3700 SSDs and 4 x 6TB drives.  With the 6TB drives in RAID-Z (RAID-5) the case can provide up to 18TB of storage, which is a good amount for any data hoarder wanting to get started.

Other Thoughts

cs721_front
I'm pretty happy with the build; I really like how much power you can get into a microserver these days.  My build has 8 cores (16 threads), up to 128GB memory, and up to 18TB of storage.  With VMware and ZFS you could run a small datacenter from a box under your desk.

I should be posting some ZFS benchmarks on this build shortly.

 

FreeNAS 9.10 on VMware ESXi 6.0 Guide

This is a guide to installing FreeNAS 9.10 under VMware ESXi and then, using ZFS, sharing the storage back to VMware.  It is roughly based on Napp-It's All-In-One design, except that it uses FreeNAS instead of OmniOS.

vmware_all_in_one_with_storage_network

Disclaimer:  I should note that FreeNAS does not officially support running virtualized in production environments.  If you run into any problems and ask for help on the FreeNAS forums, I have no doubt that Cyberjock will respond with “So, you want to lose all your data?”  So, with that disclaimer aside let’s get going:

Update: Josh Paetzel wrote a post on Virtualizing FreeNAS so this is somewhat “official” now.  I would still exercise caution.

Update 2: This guide was originally written for FreeNAS 9.3, I’ve updated it for FreeNAS 9.10.  Also, I believe Avago LSI P20 firmware bugs have been fixed and have been around long enough to be considered stable so I’ve removed my warning on using P20.  Added sections 7.1 (Resource reservations) and 16.1 (zpool layouts) and some other minor updates.

1. Get proper hardware

Example 1: Supermicro 2U Build
SuperMicro X10SL7-F (which has a built in LSI2308 HBA).
Xeon E3-1240v3
ECC Memory
6 hotswap bays with 2TB HGST HDDs (I use RAID-Z2)
4 2.5″ hotswap bays.  2 Intel DC S3700’s for SLOG / ZIL, and 2 drives for installing FreeNAS (mirrored)

Example 2: Mini-ITX Datacenter in a Box Build
X10SDV-F (built-in Xeon D-1540, an 8-core Broadwell)
ECC Memory
IBM 1015 / LSI 9220-8i HBA
4 hotswap bays with 2TB HGST HDDs (I use RAID-Z)
2 Intel DC S3700’s.  1 for SLOG / ZIL, and one to boot ESXi and install FreeNAS to.

Hard drives.  See info on my Hard Drives for ZFS post.

The LSI 2308 / M1015 has 8 ports; I like to do two DC S3700s for a striped SLOG device and then a RAID-Z2 of spinners on the other 6 slots.  Also get one (preferably two, for a mirror) drive that you will plug into the SATA ports (not on the LSI controller) for the local ESXi datastore.  I'm using DC S3700s because that's what I have, but this doesn't need to be fast storage–it's just to put FreeNAS on.

2. Flash HBA to IT Firmware

As of FreeNAS 9.3.1 or greater you should be flashing to IT-mode P20 (it looks like P21 is out now, but it's not available from every vendor yet).

I strongly suggest pulling all drives before flashing.

 LSI 2308 IT firmware for Supermicro

Here’s instructions to flash the firmware: http://hardforum.com/showthread.php?t=1758318

Supermicro firmware: ftp://ftp.supermicro.com/Driver/SAS/LSI/2308/Firmware/IT/

For IBM M1015 / LSI Avago 9220-8i

Instructions for flashing firmware:

https://forums.freenas.org/index.php?threads/confirmation-please-lsi-9211-i8-flashing-to-p20.40373/

LSI / Avago Firmware: http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9211-8i#downloads

If you already have the card passed through to FreeNAS via VT-d (steps 6-8), you can actually flash the card from FreeNAS with the sas2flash utility (in this example my card is already in IT mode so I'm just upgrading it):
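A sas2flash upgrade generally looks like the sketch below; the firmware and boot ROM file names are placeholders for whatever IT-mode images you downloaded for your card:

  sas2flash -listall                          # confirm the controller and current firmware version
  sas2flash -o -f 2118it.bin -b mptsas2.rom   # flash the IT firmware and (optionally) the boot ROM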

(Wait a few minutes–at this point FreeNAS finally crashed for me.  Power off FreeNAS, and then reboot VMware.)

Warning on P20 buggy firmware:

Some earlier versions of the P20 firmware were buggy, so make sure it's version P20.00.04.00 or later.  If you can't get P20 in a version later than P20.00.04.00, then use P19 or P16.

3. Optional: Over-provision ZIL / SLOG SSDs.

If you're going to use an SSD for SLOG you can over-provision it.  You can boot into an Ubuntu LiveCD and use hdparm; instructions are here: https://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_using_hdparm.  You can also do this after VMware is installed by passing the LSI controller to an Ubuntu VM (FreeNAS doesn't have hdparm).  I usually over-provision down to 8GB.

Update 2016-08-10: But you may want to only go down to 20GB depending on your setup!  One of my colleagues discovered that 8GB of over-provisioning wasn't even enough to max out a 10Gb network (remember, every write to VMware is a sync so it hits the ZIL no matter what) with 2 x 10Gb fiber LAGGed connections between VMware and FreeNAS.  This was on an HGST 840z, so I'm not sure if the same holds true for the Intel DC S3700… and it wasn't a virtualized setup.  But I thought I'd mention it here.
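For reference, the hdparm host-protected-area approach from that guide looks roughly like this; the device name and sector count are placeholders (the count below works out to about 20GB of visible capacity at 512-byte sectors):

  hdparm -N /dev/sdb                                          # show current and native max sectors
  hdparm -Np39062500 --yes-i-know-what-i-am-doing /dev/sdb    # permanently limit the visible capacity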

4. Install VMware ESXi 6

The free version of the hypervisor is here: http://www.vmware.com/products/vsphere-hypervisor.  I usually install it to a USB drive plugged into the motherboard's internal header.

Under configuration, storage, click add storage.  Choose one (or two) of the local storage disks plugged into your SATA ports (do not add a disk on your LSI controller).

5. Create a Virtual Storage Network.

For this example my VMware management IP is 10.2.0.231, the VMware Storage Network ip is 10.55.0.2, and the FreeNAS Storage Network IP is 10.55.1.2.

Create a virtual storage network with jumbo frames enabled.

VMware, Configuration, Add Networking. Virtual Machine…

Create a standard switch (uncheck any physical adapters).


 

Add Networking again, VMKernel, VMKernel…  Select vSwitch1 (which you just created in the previous step), give it a network different than your main network.  I use 10.55.0.0/16 for my storage so you’d put 10.55.0.2 for the IP and 255.255.0.0 for the netmask.


Some people are having trouble with an MTU of 9000.  I suggest leaving the MTU at 1500 and make sure everything works there before testing an MTU of 9000.  Also, if you run into networking issues look at disabling TSO offloading (see comments).

Under vSwitch1 go to Properties, select vSwitch, Edit, change the MTU to 9000.  Answer yes to the no active NICs warning.


Then select the Storage Kernel port, edit, and set the MTU to 9000.


6. Configure the LSI 2308 for Passthrough (VT-d).

Configuration, Advanced Settings, Configure Passthrough.


Mark the LSI2308 controller for passthrough.


You must have VT-d enabled in the BIOS for this to work so if it won’t let you for some reason check your BIOS settings.

Reboot VMware.

7. Create the FreeNAS VM.

Download the FreeNAS ISO from http://www.freenas.org/download-freenas-release.html

Create a new VM, choose custom, put it on one of the drives on the SATA ports, Virtual Machine version 11, Guest OS type is FreeBSD 64-bit, 1 socket and 2 cores.  Try to give it at least 8GB of memory.  On Networking give it two adapters, the 1st NIC should be assigned to the VM Network, 2nd NIC to the Storage network.  Set both to VMXNET3.

vmware_freenas_networking

SCSI controller should be the default, LSI Logic Parallel.

Choose Edit the Virtual Machine before completion.

If you have a second local drive (not one that you’ll use for your zpool) here you can add a second boot drive for a mirror.

Before finishing the creation of the VM click Add, select PCI Devices, and choose the LSI 2308.


And be sure to go into the CD/DVD drive settings and set it to boot off the FreeNAS iso.  Then finish creation of the VM.

7.1 FreeNAS VM Resource allocation

Also, since FreeNAS will be driving the storage for the rest of VMware, it's a good idea to make sure it has a higher priority for CPU and memory than other guests.  Edit the virtual machine; under Resources set the CPU Shares to "High" to give FreeNAS a higher priority, then under Memory allocation lock the guest memory so that VMware never borrows from it for memory ballooning.  You don't want VMware to swap out ZFS's ARC (memory read cache).

freenas_vmware_cpu_resource allocation

freenas_vmware_lock_guest_memory

 

8. Install FreeNAS.

Boot up the VM and install FreeNAS to your SATA drive (or two of them for a mirrored boot).

freenas_installer_mirrored_boot

After it’s finished installing reboot.

9. Install VMware Tools.

SKIP THIS STEP.  As of FreeNAS 9.10.1, installing VMware Tools may no longer be necessary–you can skip step 9 and go to step 10.  I'm just leaving this here for historical purposes.

In VMware right-click the FreeNAS VM,  Choose Guest, then Install/Upgrade VMware Tools.  You’ll then choose interactive mode.

Mount the CD-ROM and copy the VMware install files to FreeNAS:

Once installed, navigate to the WebGUI.  It starts out by presenting a wizard; I usually set my language and timezone and then exit the rest of the wizard.

Under System, Tunables
Add a Tunable.  The Variable should be vmxnet3_load, the Type should be Loader, and the Value YES.

Reboot FreeNAS.  On reboot you should notice that the VMXNET3 NICS now work (except the NIC on the storage network can’t find a DHCP server, but we’ll set it to static later), also you should notice that VMware is now reporting that VMware tools are installed.

vmware_tools

If all looks well shutdown FreeNAS (you can now choose Shutdown Guest from VMware to safely power it off), remove the E1000 NIC and boot it back up (note that the IP address on the web gui will be different).

10.  Update FreeNAS

Before doing anything let’s upgrade FreeNAS to the latest stable under System Update.

This is a great time to make some tea.

Once that’s done it should reboot.  Then I always go back again and check for updates again to make sure there’s nothing left.

11. SSL Certificate on the Management Interface (optional)

On my DHCP server I’ll give FreeNAS a static/reserved IP, and setup an entry for it on my local DNS server.  So for this example I’ll have a DNS entry on my internal network for stor1.b3n.org.

If you don’t have your own internal Certificate Authority you can create one right in FreeNAS:

System, CAs, Create internal CA.  Increase the key length to 4096 and make sure the Digest Algorithm is set to SHA256.

freenas_create_internal_ca

Click on the CA you just created and hit the Export Certificate button, then click the downloaded file to install the root certificate you just created on your computer.  You can install it just for your profile or for the local machine (I usually do local machine), and you'll want to make sure it is stored in the Trusted Root Certificate Authorities store.

certificate_store

trusted_root_store

Just a warning, that you must keep this Root CA guarded, if a hacker were to access this he could generate certificates to impersonate anyone (including your bank) to initiate a MITM attack.

Also Export the Private Key of the CA and store it some place safe.

Now create the certificate…

System, Certificates, Create Internal Certificate.  Once again bump the key length to 4096.  The important part here is the Common Name must match your DNS entry.  If you are going to access FreeNAS via IP then you should put the IP address in the Common Name field.

freenas_create_certificate

System, Information.  Set the hostname to your dns name.

System, General.  Change the protocol to HTTPS and select the certificate you created.  Now you should be able to use HTTPS to access the FreeNAS WebGUI.

12. Setup Email Notifications

Account, Users, Root, Change Email, set to the email address you want to receive alerts (like if a drive fails or there’s an update available).

System, Advanced

Show console messages in the footer.  Enable (I find it useful)

System Email…

Fill in your SMTP server info… and send a test email to make sure it works.

13.  Setup a Proper Swap

FreeNAS by default creates a swap partition on each drive and then stripes swap across them, so if any one drive fails there's a chance your system will crash.  We don't want this.

System, Advanced…

Find the setting "Swap size on each drive in GiB, affects new disks only. Setting this to 0 disables swap creation completely (STRONGLY DISCOURAGED)." and set it to 0 anyway–we'll create a proper swap file instead.

Open the shell.  We'll create a 4GB swap file (based on https://www.freebsd.org/doc/handbook/adding-swap-space.html).
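Following that handbook page, the gist is a 4GB file readable only by root, something like:

  dd if=/dev/zero of=/usr/swap0 bs=1m count=4096
  chmod 0600 /usr/swap0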

If you are on FreeNAS 9.10

System, Tasks, Add Init/Shutdown Script, Type=Command.  Command:

When = Post Init
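A post-init command that activates the swap file created above would look something like this (the md unit number is arbitrary):

  mdconfig -a -t vnode -f /usr/swap0 -u 0 && swapon /dev/md0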

swap_post_init

If you are on FreeNAS 9.3

System, Tunables, Add Tunable.

Variable=swapfile, Value=/usr/swap0, Type=rc.conf

swap_tunable

Back to Both:

Next time you reboot on the left Navigation pane click Display System Processes and make sure the swap shows up.  If so it’s working.

swap

14. Configure FreeNAS Networking

Setup the Management Network (which you are currently using to connect to the WebGUI).

Network, Interfaces, Add Interface, choose the Management NIC, vmx3f0, and set to DHCP.

management_nic

Setup the Storage Network

Add Interface, choose the Storage NIC, vmx3f1, and set to 10.55.1.2 (I setup my VMware hosts on 10.55.0.x and ZFS servers on 10.55.1.x), be sure to select /16 for the netmask.  And set the mtu to 9000.

storage_network2

Open a shell and make sure you can ping the ESXi host at 10.55.0.2
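If you set the MTU to 9000 it's also worth confirming jumbo frames pass end-to-end; from the FreeNAS shell a don't-fragment ping with a jumbo-sized payload does the trick (8972 bytes = 9000 minus the IP and ICMP headers):

  ping -D -s 8972 10.55.0.2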

Reboot.  Let’s make sure the networking and swap stick.

15. Hard Drive Identification Setup

Label drives.  FreeNAS is great at detecting bad drives, but it's not so great at telling you which physical drive is having an issue: it will tell you the serial number and that's about it.  But how confident are you that you'd know which drive failed?  If FreeNAS tells you that disk da3 (by the way, all these da numbers can change randomly) is having an issue, how do you know which drive to pull?  Under Storage, View Disks, you can see the serial number, but this still isn't entirely helpful because chances are you can't see the serial number without pulling a drive.  So we need to map them to slot numbers or labels of some sort.

disks

There are two ways you can deal with this.  The first, and my preference, is sas2ircu.  Assuming you connected the cables between the LSI 2308 and the backplane in the proper sequence, sas2ircu will tell you the slot number each drive is plugged into on the LSI controller.  Also, if you're using a backplane with an expander that supports SES2, it should tell you which slots the drives are in.  Try running this command:

sas2ircu

You can see that it tells you the slot number and maps it to the serial number.  If you are comfortable that you know which physical drive each slot number is in then you should be okay.
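If the bare command only lists the adapters on your system, the per-slot detail comes from the display subcommand (assuming your controller shows up as index 0):

  sas2ircu list        # enumerate controllers and their index numbers
  sas2ircu 0 display   # show enclosure/slot, serial number, and state for each attached drive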

If not, the second method is to remove all the drives from the LSI controller, then put in just the first drive and label it Slot 0 in the GUI by clicking on the drive, Edit, and entering a Description.

label_drive

labeled_drive

Put in the next drive in Slot 1 and label it, then insert the next drive and label it Slot 2 and so on…

The Description will show up in FreeNAS and will survive reboots.  It will also follow the drive even if you move it to a different slot, so it may be more appropriate to make your description match a label on the removable tray rather than the bay number.

It doesn’t matter if you label the drives or use sas2ircu, just make sure you’re confident that you can map a serial number to a physical drive before going forward.

16.1 Choose Pool Layout

For high performance the best configuration is to maximize the number of vdevs by creating mirrors (essentially RAID-10).  That said, my 6-drive RAID-Z2 array with 2 DC S3700 SSDs for SLOG/ZIL performs very well with VMware in my environment.  If you're running heavy random I/O, mirrors are more important, but if you're just running a handful of VMs, RAID-Z / RAID-Z2 will probably offer great performance as long as you have a good SSD for the SLOG device.  I like to start double parity at 5- or 6-disk vdevs, and triple parity at 9 disks.  Here are some sample configurations:

Example zpool / vdev configurations

2 disks = 1 mirror
3 disks = RAID-Z
4 disks = RAID-Z or 2 mirrors
5 disks = RAID-Z, or RAID-Z2, or 2 mirrors with hot spare.
(Don’t configure 5 disks with 4 drives being in RAID-Z plus 1 hot spare–that’s just ridiculous.  Make it a RAID-Z2 to begin with).
6 disks = RAID-Z2, or 3 mirrors
7 disks = RAID-Z2, or 3 mirrors plus hot spare
8 disks = RAID-Z2, or 4 mirrors
9 disks = RAID-Z3, or 4 mirrors plus hot spare
10 disks = RAID-Z3, or 2 vdevs of 5 disk RAID-Z2, or 5 mirrors
11 disks = RAID-Z3, or 2 vdevs of 5 disk RAID-Z2 plus a hot spare, or 5 mirrors with a hot spare
12 disks = 2 vdevs of 6 disk RAID-Z2, or 5 mirrors with 2 hot spares
13 disks = 2 vdevs of 6 disk RAID-Z2 plus a hot spare, or 6 mirrors with one hot spare
14 disks = 2 vdevs of 7 disk RAID-Z2, or 6 mirrors plus 2 hot spares
15 disks = 3 vdevs of 5 disk RAID-Z2, or 7 mirrors with 1 hot spare
16 disks = 3 vdevs of 5 disk RAID-Z2 plus a hot spare, or 7 mirrors with 2 hot spares
17 disks = 3 vdevs of 5 disk RAID-Z2 plus hot spares, or 7 mirrors with 3 hot spares
18 disks = 2 vdevs of 9 disk RAID-Z3, or 3 vdevs of 6 disk RAID-Z2, or 8 mirrors with 2 hot spares
19 disks = 2 vdevs of 9 disk RAID-Z3 plus a hot spare, or 3 vdevs of 6 disk RAID-Z2 plus a hot spare, or 8 mirrors with 3 hot spares
20 disks = 2 vdevs of 10 disk RAID-Z3, or 4 vdevs of 5 disk RAID-Z2, or 9 mirrors with 2 hot spares

Anyway, that gives you a rough idea.  The more vdevs the better random performance.  It’s always a balance between capacity, performance, and safety.

16.2  Create the Pool.

Storage, Volumes, Volume Manager.

Click the + next to your HDDs and add them to the pool as RAID-Z2.

Click the + next to the SSDs and add them to the pool.  By default the SSDs will be in one row and two columns, which will create a mirror.  If you want a stripe, just add one log device now and add the second one later.  Make certain that you change the dropdown on the SSD row to "Log (ZIL)"… it seems to lose this setting any time you make other changes, so change that setting last.  If you do not do this you will stripe the SSDs with the HDDs and possibly create a situation where any one drive failure can result in data loss.

create_pool_2

Back to Volume manager and add the second Log device…

add_zil

I have on numerous occasions had the Log get changed to Stripe after I set it to Log, so just double-check by clicking on the top level tank, then the volume status icon and make sure it looks like this:

good_tank

17.  Create an NFS Share for VMware

You can create either an NFS share, or iSCSI share (or both) for VMware.  First here’s how to setup an NFS share:

Storage, Volumes, Select the nested Tank, Create Data Set

Be sure to disable atime.
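The CLI equivalent of creating that dataset would be roughly the following, assuming the pool is named tank and the dataset vmware_nfs as in the screenshots:

  # atime off as recommended above; compression is typically inherited from the pool (lz4)
  zfs create -o atime=off tank/vmware_nfs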

freenas_vmware_nfs

Sharing, NFS, Add Unix (NFS) Share.   Add the vmware_nfs dataset, and grant access to the storage network, and map the root user to root.

add_nfs_share

Answer yes to enable the NFS service.

In VMware, Configuration, Add Storage, Network File System and add the storage:

mount_nfs

And there’s your storage!

nfs_mounted

18.  Create an iSCSI share for VMware

WARNING: At this time, based on some of the comments below about connection-drop issues on iSCSI, I suggest testing with heavy concurrent loads to make sure it's stable.  Watch dmesg and /var/log/messages on FreeNAS for iSCSI timeouts.  Personally I use NFS.  But here's how to enable iSCSI:

Storage, select the nested tank, Create zvol.  Be sure compression is set to lz4.  Check Sparse Volume.  Choose advanced mode and optionally change the default block size.  I use a 64K block size based on some benchmarks I've done comparing 16K (the default), 64K, and 128K.  64K blocks didn't really hurt random I/O but helped some on sequential performance, and also give a better compression ratio.  128K blocks had the best compression ratio but random I/O started to suffer, so I think 64K is a good middle ground.  Various workloads will probably benefit from different block sizes.
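The CLI equivalent of that zvol looks roughly like this; the size and dataset name are placeholders, and -s makes it sparse to match the Sparse Volume checkbox:

  zfs create -s -V 500G -o volblocksize=64K -o compression=lz4 tank/vmware_block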

vmware_block

Sharing, Block (iSCSI), Target Global Configuration.

Set the base name to something sensible like iqn.2011-03.org.b3n.stor1.istgt.  Set the Pool Available Space Threshold to 60%.

iscsi _target_global

Portals tab… add a portal on the storage network.

iscsi_portal

Initiator.  Add Initiator.

add_initiator

Targets.  Add Target.

add_target_3

Extents.  Add Extent.

vmware_block_extent

Associated Targets.  Add Target / Extent.

add_target_extent

Under Services enable iSCSI.

In VMware: Configuration, Storage Adapters, Add Adapter, iSCSI.

Select the iSCSI Software Adapter in the adapters list and choose properties.  Dynamic discovery tab.  Add…

add_iscsi_server

Close and re-scan the HBA / Adapter.

You should see your iSCSI block device appear…

iscsi_working

Configuration, Storage, Add Storage, Disk/LUN, and select the FreeBSD iSCSI Disk.

block_device_in_vmware

19.  Setup ZFS VMware-Snapshot coordination.

This will coordinate with VMware to take clean snapshots of the VMs whenever ZFS takes a snapshot of that dataset.

Storage.  Vmware-Snapshot.  Add VMware-Snapshot.  Map your ZFS dataset to the VMware data store.

ZFS / VMware snapshots of NFS example.

vmware_snapshot_coordination_nfs

ZFS / VMware snapshots of iSCSI example.

vmware_snapshot_coordination_block

20. Periodic Snapshots

Add periodic snapshot jobs for your VMware storage under Storage, Periodic Snapshot Tasks.  You can setup different snapshot jobs with different retention policies.

snap_tasks

21. ZFS Replication

If you have a second FreeNAS server (say stor2.b3n.org) you can replicate the snapshots over to it.  On stor1.b3n.org: Replication Tasks, View Public Key, and copy the key to the clipboard.

On the server you’re replicating to, stor2.b3n.org, go to Account, View Users, root, Modify User, and paste the public key into the SSH public Key field.  Also create a dataset called “replicated”.

Back on stor1.b3n.org:

Add Replication.  Do an SSH keyscan.

add_replication2

And repeat for any other datasets.  Optionally you could also just replicate the entire pool with the recursive option.

22.  Automatic Shutdown on UPS Battery Failure (Work in Progress)

The goal is that on power loss, before the battery runs out, all the VMware guests including FreeNAS shut down cleanly.  So far all I have gotten working is the APC UPS talking to FreeNAS under VMware.  Edit the VM settings and add a USB controller, then add a USB device and select the UPS, in my case an APC Back-UPS ES 550G.  Power FreeNAS back on.

On the shell type:

dmesg|grep APC

This will tell you where the APC device is.  In my case it's showing up as ugen0.4.  I ended up having to grant world access to the UPS device node…

For some reason I could not get the GUI to connect to the UPS; I can select ugen0.4, but the drivers dropdown just shows hyphens (——)… so I set it manually in /usr/local/etc/nut/ups.conf.
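A minimal NUT ups.conf entry for a USB APC unit generally looks like this (the section name and description are arbitrary):

  [apc]
      driver = usbhid-ups
      port = auto
      desc = "APC Back-UPS ES 550G"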

However, this file gets overwritten on reboot, and also the rc.conf setting doesn’t seem to stick.  I added this tunable to get the rc.conf setting…

nut_enable

And I created my ups.conf file at /mnt/tank/ups.conf.  Then I created a script at /mnt/tank/nutscript.sh to stop the nut service, copy my config file into place, and restart the nut service.
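A sketch of such a script, assuming the paths above and that the nut rc script is available once the nut_enable tunable is set:

  #!/bin/sh
  # stop NUT, drop the persistent config into place, and start it back up
  service nut stop
  cp /mnt/tank/ups.conf /usr/local/etc/nut/ups.conf
  service nut start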

Then under tasks, Init/Shutdown Scripts I added a task to run the script post init.

post_init

Next step is to configure automatic shutdown of the VMware server and all guests on it…  I have not done this yet.

There are a couple of approaches to take here.  One is to install a NUT client on the ESXi host, and the other is to have FreeNAS ssh into VMware and tell it to shut down.  I may update this section later if I ever get around to implementing it.

23. Backups

Before going live make sure you have adequate backups!  You can use ZFS replication if you have a fast link.  For slow network connections rsync will work better (look under Tasks -> Rsync Tasks), or use a cloud service like CrashPlan.  Here's a nice CrashPlan on FreeNAS howto.

BACKUPS BEFORE PRODUCTION.  I can’t stress this enough, don’t rely on ZFS’s redundancy alone, always have backups (one offsite, one onsite) in place before putting anything important on it.

 Setup Complete… mostly.

Well, that’s all for now.

Best Hard Drives for ZFS Server (Updated 2016)

Today’s question comes from Jeff….

Q. What drives should I buy for my ZFS server? 

Answer: Here’s what I recommend, considering a balance of cost per TB, performance, and reliability.  I prefer NAS class drives since they are designed to run 24/7 and also are better at tolerating vibration from other drives.  I prefer SATA but SAS drives would be better in some designs (especially when using expanders).  For a home or small business storage server with 8 or fewer drives I think these are the best options.

Updated: July 19, 2015 – Added quieter HGST, and updated prices.
Updated: July 30, 2016 – Updated prices, and added WL drives

2TB HGST OEM Drives – $23/TB

They won't carry the HGST 5-year warranty but you can usually get a 1-year warranty from the seller.  HGST drives are reliable, so the lower cost probably justifies the lack of a warranty.  2TB HGST drives also boast an MTBF of 2 million hours!

HGST Deskstar 7K4000 2TB 64MB Cache 7200RPM SATA III (DS724020ALE640).  Desktop/NAS grade.  1-year warranty.  $46 / $23/TB.  This Deskstar is nearly silent.  It isn't an enterprise-class drive, however the internals are nearly identical (and might be identical).  TLER is disabled by default but, unlike on most desktop drives, it can be enabled manually (see the notes on enabling TLER below).

5TB, and 6TB White Label Drives $23/TB to $27/TB

A great way to save money is to get White Label Drives.  These are NAS class drives with branding removed.  Seller usually provides 1-year warranty.  A great deal is this 6TB WhiteLabel Drive for $160 or 5TB WhiteLabel Drive for $115. These are most likely re-branded Western Digital Reds.   I hate dealing with paperwork and warranty returns–I’d much rather just buy an extra drive to have sitting on the shelf in case one fails than pay more for a warranty.

3TB, 4TB, 5TB, 6TB, and 8TB Drives $37/TB to $50/TB

I’d purchase either HGST Deskstar NAS or the WD RED series.  Both are designed for 24-7 operation and for use in systems with up to 8-bays.

hgst_deskstar_nas

HGST Deskstar NAS 64MB Cache 7200RPM SATA III, 3-year warranty.  The main advantage of this drive is that it's faster at 7200RPM, and as a result it significantly outperforms the WD Red.  See StorageReview's benchmarks of the 4TB Deskstar.  Also, at 5TB and 6TB the cache doubles to 128MB.  In general, if the price is the same or pretty close I'd prefer the HGST drive.

 

WD Red NAS 64MB ~5400RPM SATA III, 3-year warranty.  The WD drive runs a little cheaper; if the price is less than the HGST by more than $5/TB I would consider this drive to save a little money.

Careful with 8TB+ drives

Some drives 8TB and larger are using SMR (Shingled Magnetic Recording) which should not be used with ZFS if you care about performance until drivers are developed.  Be careful about any drive that says it’s for archiving purposes.

SLOG and L2ARC

For SLOG and L2ARC see my comparison of SSDs.

Capacity Planning for Failure

Most drives running 24/7 start having a high failure rate after 3 years; you might be able to squeeze 4 or 5 years out of them if you're lucky.  So a good rule of thumb is to estimate your growth and buy drives big enough that you will start to outgrow them in 4 to 5 years.  The price of hard drives is always dropping, so you don't really want to buy much more than you'll need before they start failing.  Consider that with ZFS you shouldn't run more than 70% full (with 80% being the max) for typical NAS applications, including VMs on NFS.  But if you're planning to use iSCSI you shouldn't run more than 50% full.

ZFS Drive Configurations

My preference is almost always RAID-Z2 (RAID-6) with 6 to 8 drives, which provides a storage efficiency of 0.66 to 0.75.  This scales pretty well as far as capacity is concerned, and with double parity I'm not that concerned if a drive fails.  6 drives in RAID-Z2 would net 8TB of capacity with 2TB drives, all the way up to 24TB with 6TB drives.  For larger setups use multiple vdevs, e.g. with 60 bays use 10 six-drive RAID-Z2 vdevs (each vdev increases IOPS).  For smaller setups I run 3 or 4 drives in RAID-Z (RAID-5).  In all cases it's essential to have backups… and I'd rather have two smaller servers with RAID-Z mirroring to each other than one server with RAID-Z2.  The nice thing about smaller setups is the cost of upgrading 4 drives isn't as bad as 6 or 8!
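To put numbers on that, usable space is roughly (drives − parity) × drive size, minus the free-space headroom.  For a 6-drive RAID-Z2 of 6TB drives:

  (6 − 2) × 6TB = 24TB of raw usable space
  24TB × 0.70 ≈ 17TB of comfortable working space (≈ 19TB at the 80% ceiling)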

Enabling CCTL/TLER

Time-Limited Error Recovery (TLER) or Command Completion Time Limit (CCTL).

On desktop class drives such as the HGST Deskstar, they’re typically not run in RAID mode so by default they are configured to take as long as needed (sometimes several minutes) to try to recover a bad sector of data.  This is what you’d want on a desktop, however performance grinds to a halt during this time which can cause your ZFS server to hang for several minutes waiting on a recovery.  If you already have ZFS redundancy it’s a pretty low risk to just tell the drive to give up after a few seconds, and let ZFS rebuild the data.

The basic rule of thumb.  If you’re running RAID-Z, you have two copies so I’d be a little cautious about enabling TLER.  If you’re running RAID-Z2 or RAID-Z3 you have three or four copies of data so in that case there’s very little risk in enabling it.

Viewing the TLER setting:

Enabling TLER

Disabling TLER

(TLER should always be disabled if you have no redundancy).
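On FreeBSD/FreeNAS the usual way to view and change this is the drive's SCT ERC setting via smartctl; da0 and the 7-second timeout below are examples:

  smartctl -l scterc /dev/da0          # view the current read/write recovery timeouts
  smartctl -l scterc,70,70 /dev/da0    # give up after 7.0 seconds (values are in tenths of a second)
  smartctl -l scterc,0,0 /dev/da0      # disable: let the drive retry as long as it needs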

FreeNAS vs. OmniOS / Napp-It

freenas      OmniOS_logo_200px

2015-01-07: I've updated this post to reflect changes in FreeNAS 9.3.

I've been using OpenIndiana since late 2011, and switched to OmniOS in 2013.  Lately I started testing FreeNAS; what drove me to do this is that I use CrashPlan to back up my pool, but recently Code 42 announced they'll be discontinuing Solaris support for CrashPlan, so I needed to start looking for either an alternative OS or an alternative backup solution.  I decided to look at FreeNAS because it has a CrashPlan plugin that runs in a jail using Linux emulation.  After testing it out for a while I am likely going to stay on OmniOS, since it suits my needs better, and instead switch out CrashPlan for ZnapZend as my backup solution.  But after running FreeNAS for a few months, here are my thoughts on both platforms and their strengths and weaknesses as ZFS storage servers.

Update 2015-01-07: After a lot of testing, ZnapZend ended up not working for me.  This is not its fault, but because I have limited bandwidth the snapshots don't catch up and it gets further and further behind, so for now I'm continuing with CrashPlan on OmniOS.  I am also still testing FreeNAS and may consider a switch at some point.

CIFS / SMB Performance for Windows Shares

FreeNAS has a newer implementation of SMB, supporting SMB3, I think OmniOS is at SMB1.  FreeNAS can actually function as an Active Directory Domain Controller.

OmniOS is slightly faster: writing a large file over my LAN gets around 115MBps vs 98MBps on FreeNAS.  I suspect this is because OmniOS runs SMB at the kernel level while FreeNAS runs Samba in user space.  I tried changing the FreeNAS protocol to SMB2, and even SMB1, but couldn't get past 99MBps.  This is on a Xeon E3-1240V3 so there's plenty of CPU power; Samba on FreeNAS just can't keep up.

CIFS / SMB Snapshot Integration with Previous Versions

Previous Versions snapshot integration with Windows is far superior in OmniOS.  I always use multiple snapshot jobs to do progressive thinning of snapshots.  So for example I'll set up monthly snaps with a 6-month retention, weekly with a two-month retention, daily with a two-week retention, hourly with 1 week, and every 5 minutes for two days.  FreeNAS will let you set up the snap jobs this way, but Windows Previous Versions will only show the snapshots from one of the snap jobs (so you may see your every-5-minute snaps but you can't see the hourly or weekly snaps).  OmniOS handles this nicely.  As a bonus, Napp-It has an option to automatically delete empty snapshots sooner than their retention expiration, so I don't see them in Previous Versions unless some data actually changed.

previous_versions

Enclosure Management

Both platforms struggle here, though FreeNAS has a bit of an edge… probably the best thing to do is write down the serial number of each drive along with its slot number.  In FreeNAS drives are given device names like da0, da1, etc., but unfortunately the numbers don't seem to correspond to anything and they can even change between reboots.  FreeNAS does have the ability to label drives, so you could insert one drive at a time and label them with the slot they're in.

OmniOS drives are given names like c3t5000C5005328D67Bd0 which isn’t entirely helpful.

For LSI controllers the sas2ircu utility (which works on FreeBSD or Solaris) will map the drives to slots.

Fault Management

The ZFS fault management daemon will automatically replace a failed drive with a hot spare… but it hasn’t been ported to FreeBSD yet so FreeNAS really only has warm spare capability.  Update: FreeNAS added hot spare capability on Feb 27, 2015.   To me this is a minor concern… if you’re going to use RAID-Z with a hot spare why not just configure the pool with RAID-Z2 or RAID-Z3 to begin with?  However, I can see how the fault management daemon on OmniOS would reduce the amount of work if you had several hundred drives and failures were routine.

SWAP issue on FreeNAS

While I was testing I actually had a drive fail (this is why 3-year-old Seagate drives are great to test with) and FreeNAS crashed!  The NFS pool dropped out from under VMware.  When I looked at the console I saw "swap_pager: I/O error – pagein failed".  I had run into FreeNAS Bug 208, which was closed a year ago but never resolved.  The default setting in FreeNAS is to create a 2GB swap partition on every drive, which acts like striped swap space (I am not making this up, this is the default setting).  So if any one of the drives fails it can take FreeNAS down.  The argument from FreeNAS is that you shouldn't be using swap–and perhaps that's true, but I had a FreeNAS box with 8GB memory, running only one jail with CrashPlan, brought down because a single drive failed.  That's not an acceptable default setting.  Fortunately there is a way to disable automatically creating swap partitions on FreeNAS; it's best to disable the setting before initializing any disks.

In my three years of running an OpenSolaris / Illumos based OS I've never had a drive failure bring the system down.

Running under VMware

FreeNAS is not supported running under a VM, but OmniOS is.  In my testing both OmniOS and FreeNAS work well under VMware when following the best practice of passing an LSI controller flashed to IT mode to the VM using VT-d.  I did find that OmniOS does a lot better virtualized on slower hardware than FreeNAS.  On an Avoton C2750, FreeNAS performed well on bare metal, but when I virtualized it using vmdks on drives instead of VT-d FreeNAS suffered in performance, while OmniOS performed quite well under the same scenario.

Both platforms have VMXNET3 drivers, neither has a Paravirtual SCSI driver.

Encryption

Unfortunately Oracle did not release the source for Solaris 11, so there is no encryption support on OpenZFS directly.

FreeNAS can take advantage of FreeBSD’s GELI based encryption.  FreeBSD’s implementation can use the AES instruction set, last I tested Solaris 11 the AES instruction set was not used so FreeBSD/FreeNAS probably has the fastest encryption implementation for ZFS.

There isn’t a good encryption option on OmniOS.

ZFS High Availability

Neither system supports ZFS high availability out of the box.  OmniOS can use a third party tool like RSF-1 (paid) to accomplish this.  The commercially supported TrueNAS uses RSF-1, so it should also work in FreeNAS.

ZFS Replication & Backups

FreeNAS has the ability to easily set up replication over the network as often as every 5 minutes, which is a great way to have a standby host to fail over to.  If you're going to replicate over the internet I'd say you want a small data set or a very fast connection; I ran into issues a couple of times where the replication got interrupted and had to start over from scratch.  On OmniOS, Napp-It does not offer a free replication solution (replication is a paid feature), but there are numerous free ZFS replication scripts that people have written, such as ZnapZend.
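
Under the hood all of these replication options boil down to incremental zfs send/receive over ssh, something like the sketch below (host, dataset, and snapshot names are made up):

  # Initial full, recursive send to seed the remote pool
  zfs snapshot -r tank/vms@rep-1
  zfs send -R tank/vms@rep-1 | ssh backup.example.com zfs receive -u tank/vms
  # Later runs only ship the changes between the last two snapshots
  zfs snapshot -r tank/vms@rep-2
  zfs send -R -I tank/vms@rep-1 tank/vms@rep-2 | ssh backup.example.com zfs receive -uF tank/vms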

I did get the CrashPlan plugin to work under FreeNAS; however, I found that after a reboot the CrashPlan jail sometimes wouldn't auto-mount my main pool, so it ended up not being reliable enough for me to be comfortable with.  I wish FreeNAS didn't run it in a jail.

Memory Requirements

FreeNAS is a little more memory hungry than OmniOS.  For my 8TB pool the bare minimum for FreeNAS is 8GB, while OmniOS is quite happy with 4GB, although I run it with 6GB to give it a little more ARC.
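
If memory is tight you can also cap the ARC yourself; this is roughly what that looks like (the 4GB value is just an example, and on FreeNAS the FreeBSD tunable would normally go in through its Tunables screen rather than by editing files):

  # OmniOS: add to /etc/system and reboot (value is in bytes; 0x100000000 = 4GB)
  set zfs:zfs_arc_max = 0x100000000

  # FreeBSD/FreeNAS equivalent, set as a loader tunable
  vfs.zfs.arc_max="4G"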

Hardware Support

FreeNAS supports more hardware than OmniOS.  I generally virtualize my ZFS server so it doesn’t matter too much to me but if you’re running bare metal and on obscure or newer hardware there’s a much better chance that FreeNAS supports it.  Also in 9.3 you have the ability to configure IPMI from the web interface.

VAAI (VMware vSphere Storage APIs for Array Integration)

FreeNAS now has VAAI support for iSCSI; OmniOS has no VAAI support.  As of FreeNAS 9.3 and Napp-It 0.9f4, both control panels can enable VMware snapshot integration / ESXi hot snaps.  The way this works is that before every ZFS snapshot, the storage system has VMware snapshot all the VMs, then the ZFS snapshot is taken, then the VMware snapshots are released.  This is really nice and allows for properly consistent snapshots.

GUI

The FreeNAS GUI looks a little nicer and is probably a little easier for a beginner.  The background of the screen turns red whenever you're about to do something dangerous.  I found you can set up just about everything from the GUI, whereas I had to drop into the command line more often with OmniOS.  The FreeNAS web interface seems to hang for a few seconds from time to time compared to Napp-It, but nothing major.  I believe FreeNAS will have an asynchronous GUI in version 10.

One frustration I have with FreeNAS is that it doesn't always play well with the CLI.  For example, if you create a pool via the CLI, FreeNAS doesn't see it; you have to import it using the GUI to use it there.  Napp-It is essentially an interface that runs CLI commands, so you can seamlessly switch back and forth between managing things on the CLI and in Napp-It.  This is a difference in philosophy.  Napp-It is just a web interface meant to run on top of an OS, whereas FreeNAS is more than a webapp on top of FreeBSD; FreeNAS is its own OS.

I think most people experienced with the zfs command line and Solaris are going to be a little more at home with Napp-It’s control panel, but it’s easy enough to figure out what FreeNAS is doing.  You just have to be careful what you do in the CLI.

On both platforms I found I had to switch into the CLI from time to time to do things right (e.g. FreeNAS can't set sync=always from the GUI, and Napp-It can't set up networking).
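
The sync=always one is simple enough from a shell (the dataset name is just an example):

  zfs set sync=always tank/vmware-nfs
  zfs get sync tank/vmware-nfs        # confirm it took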

As far as managing a ZFS file system, both have what I want: email alerts when there's a problem, scheduling for data scrubs, snapshots, etc.

FreeNAS has better security: it's much easier to set up an SSL cert on the management interface, and in fact you can create an internal CA to sign certificates from the GUI.  Security updates are easier to manage from the web interface in FreeNAS as well.

Community

FreeNAS and OmniOS both have great communities.  If you post anything at HardForum chances are you'll get a response from Gea, and he's usually quite helpful.  Post anything on the FreeNAS forums and Cyberjock will tell you that you need more RAM and that you'll lose all your data.  There is a lot of info on the FreeNAS forums, and the FreeNAS Redmine project is open so you can see all the issues; it's a great way to see what bugs and feature requests are out there and when they were or will be fixed.  OmniOS has an active OmniOS Discuss mailman list, and Gea, the author of Napp-It, is active on various forums.  He has answered my questions on several occasions over at HardForum's Data Storage subforum.  I've found the HardForum community a little more helpful… I've always gotten a response there, while several questions I posted on the FreeNAS forums went unanswered.

Documentation

FreeNAS documentation is great, like FreeBSD's.  Just about everything is in the FreeNAS Guide.

OmniOS documentation isn't as well organized.  I found some howtos, but nothing nearly as comprehensive as the FreeNAS Guide.  Most of what I learn about OmniOS comes from forums or the Napp-It site.

Mirrored ZFS boot device / rpool

OmniOS can boot to a mirrored ZFS rpool.
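
Attaching the second disk is roughly the following on a GRUB-era OmniOS install (device names are examples; the new disk needs a matching slice layout before the attach, and the resilver has to finish before it's bootable):

  zpool attach -f rpool c2t0d0s0 c2t1d0s0
  zpool status rpool                              # wait for the resilver to finish
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2t1d0s0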

FreeNAS does not have a way to mirror the ZFS boot device.  FreeBSD has this capability, but FreeNAS is built on NanoBSD.  The only way I know of to get redundancy on the FreeNAS boot device is to set it up on a hardware RAID card.

Update: FreeNAS 9.3 can now install to a mirrored ZFS rpool!

Features / Plugins / Extensions

Napp-It’s extensions include:

  • AMP (Apache, MySQL, PHP stack)
  • Baikal CalDAV / CardDAV Server
  • Logitech MediaServer
  • MediaTomb (DLNA / UPnP server)
  • Owncloud (Dropbox alternative)
  • PHPvirtualbox (VirtualBox interface)
  • Pydio Sharing
  • FTP Server
  • Serviio Mediaserver
  • Tine Groupware

FreeNAS plugins:

  • Bacula (Backup Server)
  • BTSync (Bittorrent Sync)
  • CouchPotato (NZB and Torrent downloader)
  • CrashPlan (Backup client/server)
  • Cruciblewds (Computer imaging / cloning)
  • Firefly (media server for Roku SoundBridge and Apple iTunes)
  • Headphones (automatic music downloader for SABnzbd)
  • HTPC-Manager
  • LazyLibrarian (follow authors and grab metadata for digital reading)
  • Maraschino (web interface for XBMC HTPC)
  • MineOS (Minecraft control panel)
  • Mylar (Comic book downloader)
  • OwnCloud (Dropbox alternative)
  • PlexMediaServer
  • s3cmd
  • SABnzbd (Binary newsreader)
  • SickBeard (PVR for newsgroup users)
  • SickRage (Video file manager for TV shows)
  • Subsonic (music streaming server)
  • Syncthing (Open source cluster synchronization)
  • Transmission (BitTorrent client)
  • XDM (eXtendable Download Manager)

All FreeNAS plugins run in a jail, so you must mount the storage that the service will need inside the jail… this can be kind of annoying, but it does allow for some nice security; for example, CrashPlan can mount the storage you want to back up as read-only.
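
As far as I can tell the jail storage feature is essentially a read-only nullfs mount under the hood; by hand it would look something like this (paths are made up, and the mount point inside the jail has to exist first):

  mount_nullfs -o ro /mnt/tank/photos /mnt/tank/jails/crashplan_1/mnt/photos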

Protocols and Services

Both systems offer a standard stack of AFP, SMB/CIFS, iSCSI, FTP, NFS, rsync, and TFTP.
FreeNAS also has WebDAV and a few extra services like Dynamic DNS, LLDP, and UPS support (the ability to connect to a UPS unit and shut down automatically).

Performance Reporting and Monitoring

Napp-It does not have reports and graphs in the free version.  FreeNAS has reporting, and you can look back as far as you want at historical performance metrics.

freenas_stats

As a Hypervisor

Both systems are very efficient at running guests of the same OS: OmniOS has Zones, and FreeNAS can run FreeBSD jails.  OmniOS also has KVM, which can be used to run any OS.  I suspect that FreeNAS 10 will have bhyve.  Both can also run VirtualBox.

Stability vs Latest

Both systems are stable; OmniOS/Napp-It seems to be the more robust of the two.  The OmniOS LTS updates are very minimal, mostly security updates and a few bug fixes.  Infrequent and minimal updates are what I like to see in a storage solution.

FreeNAS is pushing a little closer to the cutting edge.  They push out frequent updates, sometimes so frequent that I wonder whether they've been thoroughly tested.  On the other hand, if you come across an issue or feature request in FreeNAS and report it, chances are they'll get it into the next release pretty quickly.

Because of this, OmniOS is behind FreeNAS on some things, like NFS and SMB protocol versions and VAAI support for iSCSI.

I think this is an important consideration.  With FreeNAS you'll get newer features and later technologies, while OmniOS LTS is generally the better platform for stability.  The commercial TrueNAS solution is also going to be robust.  With FreeNAS you could always pick a stable version and not update very often; I really wish FreeNAS had an LTS, or at least a slower-moving stable branch that only did quarterly updates except for security fixes.

ZFS Integration

OmniOS has a slight edge on ZFS integration.  As I mentioned earlier, OmniOS has multi-tiered snapshot integration with the Windows Previous Versions feature, whereas FreeNAS can only pick one snapshot frequency to show up there.  Also, in OmniOS the NFS and SMB shares are stored as properties on the datasets, so you can export the pool, import it somewhere else, and the shares stay with the pool; you don't have to reconfigure them.
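
On OmniOS that share configuration is literally a dataset property, so something like the following is all it takes, and it survives an export/import (dataset and share names are examples):

  zfs set sharesmb=name=photos tank/photos
  zfs set sharenfs=rw tank/photos
  zfs get sharesmb,sharenfs tank/photos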

Commercial Support

OmniOS offers Commercial Support if you want it.

iX Systems offers supported TrueNAS appliances.

Performance

On an all-in-one setup, I set up VMware ESXi 6.0 with a virtual storage network and tested FreeNAS and OmniOS using both iSCSI and NFS.  On all tests MTU is set to 9000 on the storage network and compression is set to LZ4.  iSCSI volumes are sparse ZVOLs.  I gave the ZFS server 2 cores and 8GB of memory, and the guest VM 2 cores and 8GB of memory.  The guest VM is Windows 10 running CrystalDiskMark.

Environment:

  • Supermicro X10SL7-F with LSI 2308 HBA flashed to IT firmware and passed to ZFS server via VT-d (I flashed the P19 firmware for OmniOS and then re-flashed to P16 for FreeNAS).
  • Intel Xeon E3-1240v3 3.40Ghz.
  • 16GB ECC Memory.
  • 6 x 2TB Seagate 7200 drives in RAID-Z2
  • 2 x 100GB DC S3700s striped for ZIL/SLOG.  Over-provisioned to 8GB.

Latest stable updates on both operating systems:

FreeNAS 9.3  update 201503200528
OmniOS r151012 omnios-10b9c79

In CrystalDiskMark I ran 5 runs each of the 4000MB, 1000MB, and 50MB test sizes; the numbers below are the average of those runs.

On all tests every write was going to the ZIL / SLOG devices.  On NFS I left the default sync=standard (which results in every write being a sync write with ESXi).  On iSCSI I set sync=always; ESXi doesn't honor sync requests from the guest over iSCSI, so it's not safe to run with sync=standard.
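
For reference, the iSCSI test volume setup amounts to something like this (the size and names are examples; the 64K block size choice is discussed at the end):

  zfs create -s -V 200G -o volblocksize=64K -o compression=lz4 tank/iscsi-bench
  zfs set sync=always tank/iscsi-bench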

Update: 7/30/2015: FreeNAS has pushed out some updates that appear to improve NFS performance.  See the newer results here: VMware vs bhyve Performance Comparison.  Original results below.

Sequential Read MBps

seqrd

Sequential Write MBps

seqwr2

Random Read 512K MBps

rndrd512

Random Write 512K MBps

rndwr512

Random Read 4K IOPS

randrd4kqd1

Random Write 4K IOPS

rndwr4kqd1

Random Read 4K QD=32 IOPS

rdndrd4kqd32

Random Write 4K QD=32 IOPS

rndwr4kqd32

Performance Thoughts

So it appears, from these unscientific benchmarks, that OmniOS on NFS is the fastest configuration, while iSCSI performs pretty similarly on FreeNAS and OmniOS depending on the test.  One other thing I should mention, which doesn't show up in the tests, is latency.  With NFS I saw latency on the ESXi storage as high as 14ms during the tests, while latency never broke a millisecond with iSCSI.

One major drawback to my benchmarks is that only one guest is hitting the storage.  It would be interesting to repeat the test with several VMs accessing the storage simultaneously; I expect the results may be different under heavy concurrent load.

I chose a 64K iSCSI block size because larger blocks result in a higher LZ4 compression ratio.  In several quick benchmarks I found 16K and 64K performed pretty similarly; 16K did perform better at random 4K writes at QD=1, but otherwise 64K was close to 16K depending on the test.  I saw a significant drop in random performance at 128K.  Once again, under different scenarios this may not be the optimal block size for all types of workloads.
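
If you want to check that trade-off against your own data, the compression ratio per block size is easy to compare once you've written the same data to a test ZVOL at each size (the dataset names below are examples):

  zfs get -o name,value compressratio tank/iscsi-16k tank/iscsi-64k tank/iscsi-128k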