Benchmarking Guest on FreeNAS ZFS, bhyve and ESXi

FreeNAS 11 introduces a GUI for FreeBSD’s bhyve hypervisor.  This is a potential replacement for the ESXi + FreeNAS All-in-One “hyper-converged storage” design.

Hardware

Hardware is based on my Supermicro Microserver Build

  • Xeon D-1518 (4 physical cores, 8 threads) @ 2.2GHz
  • 16GB DDR4 ECC memory
  • 4 x 2TB HGST RAID-Z, 100GB Intel DC S3700s for ZIL (over-provisioned at 8GB) on an M1015.  In Environments 1 and 2 this was passed to FreeNAS via VT-d.
  • 2 x Samsung FIT USBs for booting OS (either ESXi or FreeNAS)
  • 1 x extra DC S3700 used as ESXi storage for the FreeNAS VM to be installed on in environments 1 and 2 (not used in environment 3).
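As a rough sanity check on the pool above, single-parity RAID-Z gives up about one disk's worth of space to parity. The sketch below is only back-of-the-envelope arithmetic: the function name is made up for illustration, and it ignores ZFS metadata, padding and the TB vs TiB difference.

```python
def raidz_usable_tb(disk_count: int, disk_size_tb: float, parity: int = 1) -> float:
    """Rough usable capacity of one RAID-Z vdev, ignoring ZFS overhead."""
    if disk_count <= parity:
        raise ValueError("need more disks than parity devices")
    # Each parity level costs roughly one disk's worth of capacity.
    return (disk_count - parity) * disk_size_tb


# The 4 x 2TB RAID-Z pool used in these tests.
print(raidz_usable_tb(4, 2.0))  # 6.0
```

So the 4 x 2TB pool has roughly 6TB of space before overhead.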

Environments

E1. ESXi + FreeNAS 11 All-in-one.

Setup per my FreeNAS on VMware Guide.  Ubuntu VM with Paravirtual is installed as an ESXi guest, on NFS storage backed by ZFS on FreeNAS which has raw access to disks running under the same ESXi hypervisor using virtual networking.  FreeNAS given 2 cores and 10GB memory.  Guest gets 1GB memory.  Guest tested with 1C and 2C.

E2. Nested bhyve + ESXi + FreeNAS 11 All-in-one.

Nested virtualization test.  Ubuntu VM with VirtIO is installed as a bhyve guest on FreeNAS which has raw access to disks running under the ESXi Hypervisor.  FreeNAS given 4 cores and 12GB memory.  Guest gets 1GB memory.  Guest tested with 1C and 2C.  What is neat about this environment is it could be used as a stepping stone if migrating from environment 1 to environment 3 or vice-versa (I actually tested migrating with success).

E3. bhyve + FreeNAS 11

Ubuntu VM with VirtIO is installed as a bhyve guest on FreeNAS on bare metal.  Guest gets 1GB memory.  Guest was backed with a ZVOL since that was the only option.  Tested with 1C and 2C.

All environments used FreeNAS 11; E1 and E2 also used VMware ESXi 6.5.

Testing Notes

A reboot of the guest and FreeNAS was performed between each test so as to clear ZFS’s ARC (in memory read cache).  The sysbench test files were recreated at the start of each test.  The script I used for testing is https://github.com/ahnooie/meta-vps-bench with networking tests removed.

No attempts on tuning were made in any environment.  Just used the sensible defaults.

Disclaimer on comparing Apples to Oranges

This is not a business or enterprise level comparison.  This test is meant to show how an Ubuntu guest performs in various configurations on the same hardware with constraints of a typical budget home server running a free “hyperconverged” solution–a hypervisor and FreeNAS storage on the same physical box.  Not all environments are meant to perform identically…my goal is just to see if the environments perform “good enough” for home use.  An obvious example of this is environments using NFS backed storage are going to perform slower than environments with local storage… but it should still at the very least max out a 1Gbps ethernet.  This set of tests is designed to benchmark how I would setup each environment given the constraint of one physical box running both the hypervisor and FreeNAS + ZFS as the storage backend.  The test is limited to a single guest VM.  In the real world dozens, if not hundreds or even thousands of VMs are running simultaneously so advanced hypervisor features like memory deduplication are going to make a big difference.  This test made no attempt to benchmark such.  This is not an apples to apples test, so be careful what conclusions you derive from it.
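To put a number on "max out a 1Gbps ethernet": the raw ceiling of a 1Gbps link is 125 MB/s. Here is a minimal sketch of that conversion. The helper names and the sample throughputs are hypothetical (they are not results from these tests), and protocol overhead is ignored, so real-world numbers land a bit lower.

```python
def link_ceiling_mb_per_s(link_gbps: float) -> float:
    """Raw line rate of a link in MB/s (decimal units, no protocol overhead)."""
    bits_per_second = link_gbps * 1_000_000_000
    return bits_per_second / 8 / 1_000_000


def saturates_link(throughput_mb_per_s: float, link_gbps: float = 1.0) -> bool:
    """True if a measured throughput is at or above the link's raw ceiling."""
    return throughput_mb_per_s >= link_ceiling_mb_per_s(link_gbps)


print(link_ceiling_mb_per_s(1.0))  # 125.0
print(saturates_link(130.0))       # True  (hypothetical measurement)
print(saturates_link(90.0))        # False (hypothetical measurement)
```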

CPU 1 and 2 threaded test

I’d say these are equivalent, which probably shows how little overhead there is from the hypervisor these days, though nested virtualization is a bit slower.

CPU 1 and 2 threaded

CPU 4 threaded test

Good to see that 2 cores actually performs faster than 1 core on a 4 threaded test.  Nothing to see here…

CPU 4 threads

Memory Operations Per Second

Horrible performance with nested virtualization, but with the hypervisor on bare metal, ESXi and bhyve performed identically.

Memory OPS

Memory MB/s

Once again nested virtualization was slow… other than that, neck and neck performance.

Memory Test

OLTP Transactions Per Second

The ESXi environment clearly takes the lead over bhyve, especially as the number of cores / threads increases.  This is interesting because ESXi comes out ahead despite the I/O penalty from using NFS, so it is more than making up for that somewhere else.

OLTP Test

Disk I/O Requests per Second

Clearly there’s an advantage to using local ZFS storage vs NFS.  I’m a bit disappointed in the nested virtualization performance since from a storage standpoint it should be equivalent to bare metal FreeNAS, but it may be due to the slow memory performance in that environment.

Disk Random I/O

Disk Sequential Read/Write MBps

No surprises, ZFS local storage is going to outperform NFS

Disk Sequential I/O

Well there you have it.  I think it’s safe to say that bhyve is a viable solution for home (although I would like to see more people using it in the wild before considering it robust–I imagine we’ll see more of that now that FreeNAS has a UI for it).  For low resource VMs E2 (nested virtualization) is a way to migrate between E1 and E3–but it’s not going to work for high performance VMs because of the memory performance hit.

FreeNAS Corral on VMware 6.5

This guide will install FreeNAS 10 (Corral) under VMware 6.5 ESXi, then via NFS share ZFS backed storage back to VMware.  This is an update of my FreeNAS 9.10 on VMware 6.0 Guide.

“Hyperconverged” Design Overview

FreeNAS Vmware

FreeNAS is installed as a Virtual Machine on the VMware Hypervisor.  An LSI HBA in IT Mode is passed to FreeNAS via VT-d Passthrough.  A ZFS pool is created on the disks attached to the HBA, with ZFS providing RAID-Z redundancy.  An NFS dataset is then shared from FreeNAS and mounted from VMware, where it provides storage for the remaining guests.  Optionally containers and VM guests can run directly on FreeNAS itself using bhyve.

FreeNAS Corral

FreeNAS 10 (now called FreeNAS Corral) is a major rewrite over FreeNAS 9.10: the GUI has been overhauled, and it has a CLI and an API.  I think the best feature is the bhyve hypervisor and docker support.  To some degree, for a single all-in-one hypervisor+NAS server you may not even need VMware and may be able to get away with bhyve and docker.

FreeNAS Corral Dashboard

Like anything new I advise caution against running it in a production environment.  I do see quite a few rough edges and a few missing features that are available in FreeNAS 9.10.  I imagine we’ll see frequent updates with polishing and features added.  A good rule of thumb is to wait until TrueNAS hardware is shipping with the “Corral” version.   I think this is the best release of FreeNAS yet, and it is going to be a great platform moving forward!

1. Get Hardware

This is based on my Supermicro X10SDV Build.  For drives I used 4 x White Label NAS class HDDs (see ZFS Hard Drive Guide) and two Intel DC S3700s (similar models between S3500 and S3720 should be fine), which often show up for a decent price on Ebay.  One SSD will be used to boot VMware and provide the initial data storage and the other used as a ZIL.

You will need an HBA to pass storage to the FreeNAS guest.  I suggest the ServerRAID IBM M1015 flashed to IT mode, or you can usually find the LSI 9210-8i already flashed to IT mode for a decent price on eBay.  You will also need a Mini-SAS to 4x SAS SATA Forward Breakout Cable.

2. IPMI Setup

Go ahead and plug in the network cables to the IPMI management port, as well as at least one of the normal ethernet ports.

This should work with just about any server class Supermicro board…. first download the Supermicro IPMIView tool (I just enter “Private” for the company).  Once installed run “IPMIView20” from the Start Menu (you may need to run it as Administrator).

Scan for IPMI Devices… once it finds your Supermicro server select it and Save.

Login to IPMI using ADMIN / ADMIN (you’ll want to change that obviously).

IPMI Login

KVM Console Tab…

KVM Console Tab

Load the VMware ISO file to the Virtual DVD-ROM drive…

Download the VMware ESXi Free Hypervisor.

Select ISO file, Open Image, select the VMware ISO file which you can download here, and then hit “Plug In”

KVM Virtual Storage

Power on

KVM Power On

Hit Delete repeatedly…

KVM Boot

Change the boot order.  I made the ATEN Virtual CD/DVD the primary boot device, made the Intel SSD DC S3700 that I’ll install VMware to the secondary, and disabled everything else.

BIOS Boot Order

Save and Exit, and it should boot the VMware installer ISO.

3. Install VMware ESXi 6.5.0

Install ESXi

VMware Installer

Install to the Intel SSD Drive.

VMware Install Select Drive

Once installation is complete “Plug Out” the Virtual ISO file before rebooting.

Unplug ISO file

Once it comes up get the IP address (or set it if you want it to have a static IP which I highly recommend).

VMware screen

4. PCI Passthrough HBA

Go to that address in your browser (I suggest Chrome).  Manage, Hardware, PCI Devices, select the LSI HBA card and Enable Passthrough.

Passthrough LSI HBA

Reboot

5. Setup VMware Storage Network

In the examples below my LAN / VM Network is on 10.2.0.0/16 (255.255.0.0) and my Storage network is on 10.55.0.0/16.  You may need to adjust for your network.  My storage network is on VLAN 55.

I like to keep my Storage Network separate from my LAN / VM Network.  So we’ll create a VM Storage Network portgroup with a VLAN ID of 55.

Networking, Port groups, Add Port Group

Add Port Group

Add VM Storage Network with VLAN ID of 55.

(you can choose a different VLAN ID, my storage network is 10.55.0.0/16 so I use “55” to match the network so that I don’t have to remember what VLAN goes to what network, but it doesn’t have to match).

Add a second port group just like it called Storage Network with the same VLAN ID (55).

Storage Network

Add VMKernel NIC

VMKernel NIC

Attach it to the Storage Network and give it an address of 10.55.0.4 with a netmask of 255.255.0.0

VMKernel Storage
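If you adjust the addressing for your own network, it's easy to double-check the subnet math before typing it into VMware. A small sketch using Python's standard ipaddress module with the addresses from this guide (the variable names are just for illustration):

```python
import ipaddress

# Networks and VMkernel address used in this guide; substitute your own.
lan = ipaddress.ip_network("10.2.0.0/16")
storage = ipaddress.ip_network("10.55.0.0/16")
vmk = ipaddress.ip_address("10.55.0.4")

print(storage.netmask)        # 255.255.0.0
print(vmk in storage)         # True: the VMkernel NIC sits on the storage network
print(lan.overlaps(storage))  # False: LAN and storage networks are separate
```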

You should end up with this…

6. Create a FreeNAS Corral VM

Create VM

FreeBSD (64-bit)

Create VM

Install it to the DC S3700 Datastore that VMware is installed on.

Add PCI Device and Select your LSI Card.

Select HBA PCI

Add a second NIC for the VM Storage Network.  You should have two NICs for FreeNAS, a VM Network and a VM Storage Network, and you should set the Adapter Type to VMXNET 3 on both.

VMXNET3

Add NIC

I usually give my FreeNAS VM 2 cores; if doing anything heavy (especially if you’ll be running docker images or bhyve under it) you may want to increase that count.  One rule with VMware is do not give VMs more cores than they need.  I usually give each VM one core and only consider more if that particular VM needs more resources.  This will reduce the risk of CPU co-stops occurring.  Gabrie van Zanten’s How too many vCPUs can negatively affect performance is a good read.

2 Cores

ZFS needs memory.  FreeNAS 10 needs 8GB memory minimum.  Lock it.

Make the Hard Disk VMDK 16GB.  There’s an issue with the VMware 6.5 SCSI controller on FreeBSD/FreeNAS.  You’ll know it if you see an error like:

UNMAP failed. disabling BIO_DELETE
UNMAP CDB: 42 00 00 00 00 00 00 00 18 000.
CAM status: SCSI Status Error.
SCSI status: Check Condition.
SCSI sense: ILLEGAL REQUEST asc:26,0 (Invalid field in parameter list).
Command byte 0 is invalid.
Error 22, Unretryable error.

To prevent this, change the Virtual Device Node on the hard drive to SATA controller 0, and SCSI Controller 0 should be LSI Logic SAS

SATA Controller
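If you want to check whether a FreeNAS VM is hitting this, one option is to scan saved console or dmesg output for the error text shown above. This is only a sketch: the function name is hypothetical and it simply matches the two strings from the first line of that error.

```python
# Strings taken from the first line of the error message shown above.
UNMAP_MARKERS = ("UNMAP failed", "disabling BIO_DELETE")


def has_unmap_failure(log_text: str) -> bool:
    """True if any line of the log contains one of the UNMAP error markers."""
    return any(
        marker in line
        for line in log_text.splitlines()
        for marker in UNMAP_MARKERS
    )


sample = "UNMAP failed. disabling BIO_DELETE\nCAM status: SCSI Status Error."
print(has_unmap_failure(sample))           # True
print(has_unmap_failure("all good here"))  # False
```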

Add CD/DVD Drive, under CD/DVD Media hit Browse to upload and select the FreeNAS Corral ISO file which you can download from FreeNAS.

Add CD-ROM

7. Install FreeNAS VM

Power on the VM…

Select the VMware disk to install to.  I should note that if you create two VMDKs you can select them both at this screen and it will create a ZFS boot mirror, if you have an extra hard drive you can create another VMware data store there and put the 2nd vmdk there.  This would provide some extra redundancy for the FreeNAS boot pool.  In my case I know the DC S3700s are extremely reliable, and if I lost the FreeNAS OS I could just re-import the pool or failover to my secondary FreeNAS server.

Install FreeNAS to VMDK

Boot via BIOS.

Once FreeNAS is installed reboot and you should get the IP from DHCP on the console (once again I suggest setting this to a static IP).

If you hit that IP with a browser you should have a login screen!

8. Update and Reboot

Before doing anything…. System, Updates, Update and Reboot.

Update

(Note: to get better insight into a task progress head over to the Console and type: task show).

9. Setup SSL Certificate

First, set your hostname, and also create a DNS entry pointing at the FreeNAS IP.

Create Internal CA

Export Certificate

Untar the file and click HobbitonCA.crt to install it into the Trusted Root Certificate Authorities store.  I should note that if someone were to compromise your CA or gain the key they could do a MITM attack on you by forging SSL certificates for other sites.

Create a Certificate for FreeNAS

Create Certificate

Listen on HTTP+HTTPS and select the Certificate.  I also increase the token Lifetime since I religiously lock my workstation when I’m away.

Listen on HTTPS

And now SSL is Secured

SSL Secured

 

10. Create Pool

Do you want Performance, Capacity, or Redundancy?  Drag the white circle thing where you want on the triangle and FreeNAS will suggest a zpool layout.  With 4 disks I chose “Optimal” and it suggested RAID-Z which is what I wanted.  Be sure to add the other SSD as a SLOG / ZIL / LOG.

Pool Creation

11. Create Users

It’s probably best not to be logging in as root all the time.  Create some named users with Administrator access.

12. Create Top Level Dataset

I like to create a top level dataset with a unique name for each FreeNAS server, that way it’s easier to replicate datasets to my other FreeNAS servers and perform recursive tasks (such as snapshots, or replication) on that top level dataset without having to micromanage them.  I know you can sometimes do recursive tasks on the entire pool, but oftentimes I want to exclude certain datasets from those tasks (such as if those datasets are being replicated from another server).

If you’d like to see more on my reasoning for using a top level dataset see my ZFS Dataset Hierarchy.

Storage, tank3, Datasets, New…

Top Level DataSet

13. Setup Samba

Services, Sharing, SMB, set the NetBIOS name and Workgroup and Enable.

Storage, SMB3, Share, to create a new dataset with a Samba Share.  Be sure to set the ownership to a user.

SMB Share

14. Setup NFS Share for VMware

I believe at this time VMware and FreeNAS don’t work together on NFSv4, so best to stick to NFSv3 for now.

NFS Share for VMware

Mount NFS Store in VMware by going to Storage, Datastores, new datastore, Mount NFS datastore.

NFS Mount

15. Snapshots

I setup automatic recursive snapshots on the top level dataset.  I like to do pruning snapshots like this:

every 5 minutes -> keep for 2 hours
every hour -> keep for 2 days
every day -> keep for 1 week
every week -> keep for 4 weeks
every 4 weeks -> keep for 12 weeks
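To get a feel for how many snapshots that schedule keeps around at steady state, divide each retention period by its interval. A quick sketch (the names are made up for illustration, and the counts are per dataset, so a recursive task multiplies them by the number of child datasets):

```python
from datetime import timedelta

# (snapshot interval, retention period) pairs from the schedule above.
SCHEDULE = [
    (timedelta(minutes=5), timedelta(hours=2)),
    (timedelta(hours=1), timedelta(days=2)),
    (timedelta(days=1), timedelta(weeks=1)),
    (timedelta(weeks=1), timedelta(weeks=4)),
    (timedelta(weeks=4), timedelta(weeks=12)),
]


def snapshots_retained(interval: timedelta, keep_for: timedelta) -> int:
    """Approximate number of snapshots alive at once for one schedule tier."""
    return keep_for // interval


counts = [snapshots_retained(interval, keep_for) for interval, keep_for in SCHEDULE]
print(counts)       # [24, 48, 7, 4, 3]
print(sum(counts))  # 86
```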

And SAMBA has Previous Versions integration with ZFS Snapshots, this is great for letting users restore their own files.

SMB ZFS Integration

16. ZFS Replication to Backup Server

Before putting anything into production setup automatic backups.  Preferably one onsite and one offsite.

Peering, New FreeNAS, and enter the details for your secondary FreeNAS server.

FreeNAS Peering

 

Now you’ll see why I created a top level dataset under the pool….

Storage, Tank3, Replications, New, select the stor2.b3n.org Peer, source dataset is your top level dataset, tank3/ds4, and target dataset is tank4/ds4 on the backup FreeNAS server.

Compression should be FAST over a LAN or BEST over a slow WAN.

FreeNAS Replication

Go to another menu option and then back to Storage, tank3, Replications, replication_ds4, and Start the replication and check back in a couple hours to make sure it’s working.  My first replication attempt hung, so I canceled the task and started it again.  I also found that adjusting the peer interval from 1 minute to 5 seconds under Peering may have helped.

FreeNAS Notifications

16.1 Offsite Backups

It’s also a good idea to have Offsite backups, you could use S3, or a CrashPlan Docker Container, etc.

17. Setup Notifications

You want to be notified when something fails.  FreeNAS can be configured to send an email or send out Pushbullet notifications.  Here’s how to setup Pushbullet.

Create or Login to your Pushbullet account.  Settings, Account, Create an Access Token

PushBullet Access Token

Services, Alerts & Reporting, Add the access key (bottom right) and configure the alerts to send out via Pushbullet.

PushBullet Setup

You can use the Pushbullet Chrome extension or Android/iOS apps to receive alerts.

18. bhyve VMs and Docker Containers under FreeNAS under VMware

Add another Port Group on your VM Network which allows Promiscuous mode, MAC address changes, and Forged transmits.  You can connect FreeNAS and any VMs you really trust to this port group.

Trusted Portgroup

Power down and edit the FreeNAS VM.  Change the VM Network to VM Network Promiscuous

Network Change

Enable Nested Virtualization, under CPU, Hardware virtualization, [x] Expose hardware assisted virtualization to the guest OS.

Enabled Nested Virtualization

After booting back up you should be able to create VMs and Docker Containers in FreeNAS under VMware.

And more….

Use at your own risk.

More topics may come later if I ever get around to it.

ZFS Dataset Hierarchy | Data Hoarder Edition

OpenZFS Logo

ZFS is flexible and will let you name and organize datasets however you choose–but before you start building datasets there are some ways to make management easier in the long term.  I’ve found the following convention works well for me.  It’s not “the” way by any means, but I hope you will find it helpful.  I wish some tips like this had been written when I built my first storage system 4 years ago.

Here are my personal ZFS best practices and naming conventions to structure and manage ZFS data sets.

ZFS Pool Naming

I never give two zpools the same name, even if they’re in different servers, in case there is the off-chance that sometime down the road I’ll need to import two pools into the same system.  I generally like to name my zpool tank[n] where n is an incremental number that’s unique across all my servers.

So if I have two servers, say stor1 and stor2 I might have two zpools :

stor1.b3n.org: tank1
stor2.b3n.org: tank2
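If you keep an inventory of your servers, a few lines of Python can flag pool names that would collide on import. This is just a sketch of the naming rule; the inventory dictionaries and function name are hypothetical.

```python
from collections import Counter


def duplicate_pool_names(pools_by_server: dict[str, list[str]]) -> list[str]:
    """Return pool names that appear more than once across all servers."""
    counts = Counter(
        pool for pools in pools_by_server.values() for pool in pools
    )
    return sorted(name for name, seen in counts.items() if seen > 1)


good = {"stor1.b3n.org": ["tank1"], "stor2.b3n.org": ["tank2"]}
bad = {"stor1.b3n.org": ["tank"], "stor2.b3n.org": ["tank"]}

print(duplicate_pool_names(good))  # []
print(duplicate_pool_names(bad))   # ['tank']
```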

Top Level ZFS Datasets for Simple Recursive Management

Create a top level dataset called ds[n] where n is a unique number across all your pools, just in case you ever have to bring two separate datasets onto the same zpool.  The reason I like to create one main top-level dataset is it makes it easy to manage high level tasks recursively on all sub-datasets (such as snapshots, replication, backups, etc.).  If you have more than a handful of datasets you really don’t want to be configuring replication on every single one individually.  So on my first server I have:

tank1/ds1

I usually mount tank1/ds1 as readonly from my CrashPlan VM for backups.  You can configure snapshot tasks, replication tasks, backups, all at this top level and be done with it.

freenas_snapshot_pruning
ZFS snaps and pruning recursively managed at the top level dataset

Name ZFS Datasets for Replication

One of the reasons to have a top level dataset is if you’ll ever have two servers…

stor1.b3n.org
   | - tank1/ds1

stor2.b3n.org
   | - tank2/ds2

I replicate them to each other for backup.  Having that top level ds[n] dataset lets me manage ds1 (the primary dataset on the server) completely separately from the replicated dataset (ds2) on stor1.

stor1.b3n.org
 | - tank1/ds1
 | - tank2/ds2 (replicated)

stor2.b3n.org
 | - tank2/ds2
 | - tank1/ds1 (replicated)
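The cross-replication layout above can be derived mechanically from each server's primary dataset. A minimal sketch (the function name is hypothetical; it only builds the listing and does not touch ZFS):

```python
def replication_layout(primaries: dict[str, str]) -> dict[str, list[str]]:
    """Each server holds its own dataset plus a replica of every other server's."""
    layout = {}
    for server, own_dataset in primaries.items():
        replicas = [
            f"{dataset} (replicated)"
            for other, dataset in primaries.items()
            if other != server
        ]
        layout[server] = [own_dataset] + replicas
    return layout


layout = replication_layout(
    {"stor1.b3n.org": "tank1/ds1", "stor2.b3n.org": "tank2/ds2"}
)
for server, datasets in layout.items():
    print(server, datasets)
```

Because the ds[n] names are unique, the replicas never collide with the server's own top level dataset.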

Advice for Data Hoarders.  Overkill for the Rest of Us

supermicro_zfs

The ideal is we backup everything.  But in reality storage costs money, and WAN bandwidth isn’t always available to backup everything remotely.  I like to structure my datasets such that I can manage them by importance.  So under the ds[n] dataset create sub-datasets.

stor1.b3n.org
 | - tank1/ds1/kirk – very important – family pictures, personal files
 | - tank1/ds1/spock – important – ripped media, ISO files, etc.
 | - tank1/ds1/redshirt – scratch data, tmp data, testing area
 | - tank1/ds1/archive – archived data
 | - tank1/ds1/backups – backups

Kirk – Very Important.  Family photos, home videos, journal, code, projects, scans, crypto-currency wallets, etc.  I like to keep four to five copies of this data using multiple backup methods and multiple locations.  It’s backed up to CrashPlan offsite, rsynced to a friend’s remote server, snapshots are replicated to a local ZFS server, plus an annual backup to a local hard drive for cold storage.  3 copies onsite, 2 copies offsite, 2 different file-system types (ZFS, XFS) and 3 different backup technologies (CrashPlan, Rsync, and ZFS replication).  I do not want to lose this data.

Multiple Backup Locations Across the World
Important data is backed up to multiple geographic locations

Spock – Important.  Important data that would be a pain to lose, might cost money to reproduce, but it isn’t catastrophic.  If I had to go a few weeks without it I’d be fine.  For example, rips of all my movies, downloaded Linux ISO files, Logos library and index, etc.  If I lost this data and the house burned down I might have to repurchase my movies and spend a few weeks ripping them again, but I can reproduce the data.  For this dataset I want at least 2 copies, everything is backed up offsite to CrashPlan and if I have the space local ZFS snapshots are replicated to a 2nd server giving me 3 copies.

redshirt_startrek

Redshirt – This is my expendable dataset.  This might be a staging area to store MakeMKV rips until they’re transcoded, I might do video editing here or test out VMs.  This data doesn’t get backed up… I may run snapshots with a short retention policy.  Losing this data would mean losing no more than a day’s worth of work.  I might also run zfs sync=disabled to get maximum performance here.  And typically I don’t do ZFS snapshot replication to a 2nd server.  In many cases it will make sense to pull this out from under the top level ds[n] dataset and have it be by itself.

Backups – Dataset contains backups of workstations, servers, cloud services–I may backup the backups to CrashPlan or some online service and usually that is sufficient as I already have multiple copies elsewhere.

Archive – This is data I no longer use regularly but don’t want to lose. Old school papers that I’ll probably never need again, backup images of old computers, etc.  I set this dataset to compression=gzip-9, and back it up to CrashPlan plus a local backup and try to have at least 3 copies.

Now, you don’t have to name the datasets Kirk, Spock, and Redshirt… but the idea is to identify importance so that you’re only managing a few datasets when configuring ZFS snapshots, replication, etc.  If you have unlimited cheap storage and bandwidth it may not be worth it to do this–but it’s nice to have the option to prioritize.
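The per-tier properties mentioned above (sync=disabled on the scratch dataset, maximum gzip compression on the archive) are set with zfs set. The sketch below only builds the command strings so you can review them before running anything. The dictionary and function name are hypothetical, and note that ZFS spells the strongest gzip level as gzip-9. Keep in mind sync=disabled trades safety for speed, so it only belongs on data you can afford to lose.

```python
# Properties per tier, following the descriptions above (dataset names are examples).
TIER_PROPERTIES = {
    "tank1/ds1/redshirt": {"sync": "disabled"},
    "tank1/ds1/archive": {"compression": "gzip-9"},
}


def zfs_set_commands(tier_properties: dict[str, dict[str, str]]) -> list[str]:
    """Build (but do not run) one 'zfs set' command per dataset property."""
    return [
        f"zfs set {prop}={value} {dataset}"
        for dataset, props in tier_properties.items()
        for prop, value in props.items()
    ]


for command in zfs_set_commands(TIER_PROPERTIES):
    print(command)
# zfs set sync=disabled tank1/ds1/redshirt
# zfs set compression=gzip-9 tank1/ds1/archive
```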

Now… once I’ve established that hierarchy I start defining my datasets that actually store data which may look something like this:

stor1.b3n.org
| - tank1/ds1/kirk/photos
| - tank1/ds1/kirk/git
| - tank1/ds1/kirk/documents
| - tank1/ds1/kirk/vmware-kirk-nfs
| - tank1/ds1/spock/media
| - tank1/ds1/spock/vmware-spock-nfs
| - tank1/ds1/spock/vmware-iso
| - tank1/ds1/redshirt/raw-rips
| - tank1/ds1/redshirt/tmp
| - tank1/ds1/archive
| - tank1/ds1/archive/2000
| - tank1/ds1/archive/2001
| - tank1/ds1/archive/2002
| - tank1/ds1/backups
| - tank1/ds1/backups/incoming-rsync-backups
| - tank1/ds1/backups/windows
| - tank1/ds1/backups/windows-file-history

 

With this ZFS hierarchy I can manage everything at the top level of ds1 and just setup the same automatic snapshot, replication, and backups for everything.  Or if I need to be more precise I have the ability to handle Kirk, Spock, and Redshirt differently.
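Because the importance tier is always the component right under ds1, grouping datasets by tier is a simple string split. A small sketch using a few of the dataset names from the listing above (the function name is hypothetical):

```python
from collections import defaultdict

# A subset of the hierarchy shown above.
DATASETS = [
    "tank1/ds1/kirk/photos",
    "tank1/ds1/kirk/git",
    "tank1/ds1/spock/media",
    "tank1/ds1/redshirt/tmp",
    "tank1/ds1/archive",
    "tank1/ds1/archive/2000",
]


def group_by_tier(datasets: list[str], top_level: str = "tank1/ds1") -> dict[str, list[str]]:
    """Map each tier (kirk, spock, ...) to the datasets that live under it."""
    tiers = defaultdict(list)
    prefix = top_level + "/"
    for dataset in datasets:
        if not dataset.startswith(prefix):
            continue
        tier, _, child = dataset[len(prefix):].partition("/")
        if child:  # skip the tier dataset itself, e.g. tank1/ds1/archive
            tiers[tier].append(child)
    return dict(tiers)


print(group_by_tier(DATASETS))
# {'kirk': ['photos', 'git'], 'spock': ['media'], 'redshirt': ['tmp'], 'archive': ['2000']}
```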

 

FreeNAS Mini XL, 8 bay Mini-ITX NAS

Catching up on email, I saw a Newsletter from iX Systems announcing the FreeNAS Mini XL (the irony).  On the new FreeNAS Mini page it looks just like the FreeNAS mini but taller to accommodate 8-bays.

Available on Amazon starting at $1,500 with no drives.

Here’s the Quick Start Guide and Data Sheet.

The pictures show what appears to be the ASRock C2750D4I motherboard, which has an 8-core Atom / Avoton processor.  With the upcoming FreeNAS 9.10 (based on FreeBSD 10) it should be able to run the bhyve hypervisor as well (at least from the CLI; you might have to wait until FreeNAS 10 for a bhyve GUI), meaning a nice all-in-one hypervisor with ZFS without the need for VT-d.  This may end up being a great successor to the HP Microserver for those wanting to upgrade with a little more capacity.

The case is the Ablecom CS-T80 so I imagine we’ll start seeing it from Supermicro soon as well.  According to Ablecom it has 8 hotswap bays plus 2 x 2.5″ internal bays and still managed to have room for a slim DVD/Blu-Ray drive.

ablecom_cs_t80

It’s really great to see an 8-bay Mini-ITX NAS case that’s nicer than the existing options out there.  I hope the FreeNAS Mini XL will have an option for a more powerful motherboard even if it means having to use up the PCI-E slot with an HBA–I’m not really a fan of the Marvell SATA controllers on that board, and of course a Xeon-D would be nice.

 

 

VMware vs bhyve Performance Comparison

Playing with bhyve

Here’s a look at Gea’s popular All-in-one design which allows VMware to run on top of ZFS on a single box using a virtual 10Gbe storage network.  The design requires an HBA, and a CPU that supports VT-d so that the storage can be passed directly to a guest VM running a ZFS server (such as OmniOS or FreeNAS).  Then a virtual storage network is used to share the storage back to VMware.

vmware_all_in_one_with_storage_network
VMware and ZFS: All-In-One Design

bhyve can simplify this design: since it runs under FreeBSD, the host already has a ZFS server.  This not only simplifies the design, but it could potentially allow a hypervisor to run on simpler, less expensive hardware.  The same design in bhyve eliminates the need for a dedicated HBA and a CPU that supports VT-d.

freebsd_bhyve
Simpler bhyve design

I’ve never understood the advantage of type-1 hypervisors (such as VMware and Xen) over Type-2 hypervisors (like KVM and bhyve).  Type-1 proponents say the hypervisor runs on bare metal instead of an OS… I’m not sure how VMware isn’t considered an OS except that it is a purpose-built OS and probably smaller.  It seems you could take a Linux distribution running KVM and take away features until at some point it becomes a Type-1 hypervisor.  Which is all fine but it could actually be a disadvantage if you wanted some of those features (like ZFS).  A type-2 hypervisor that supports ZFS appears to have a clear advantage (at least theoretically) over a type-1 for this type of setup.

In fact, FreeBSD may be the best virtualization / storage platform.  You get ZFS and bhyve, and also jails.  You really only need to run bhyve when virtualizing a different OS.

bhyve is still pretty young, but I thought I’d run some tests to see where it’s at…

Environments

This is running on my X10SDV-F Datacenter in a Box Build.

In all environments the following parameters were used:

  • Supermicro X10SDV-F
  • Xeon D-1540
  • 32GB ECC DDR4 memory
  • IBM ServerRaid M1015 flashed to IT mode.
  • 4 x HGST Ultrastar 7K300 2TB enterprise drives in RAID-Z
  • One DC S3700 100GB over-provisioned to 8GB used as the log device.
  • No L2ARC.
  • Compression = LZ4
  • Sync = standard (unless specified).
  • Guest (where tests are run): Ubuntu 14.04 LTS, 16GB disk, 4 cores, 1GB memory.
  • OS defaults are left as is, I didn’t try to tweak number of NFS servers, sd.conf, etc.
  • My tests fit inside of ARC.  I ran each test 5 times on each platform to warm up the ARC.  The results are the average of the next 5 test runs.
  • I only tested an Ubuntu guest because it’s the only distribution I run (in quantity anyway) in addition to FreeBSD.  I suppose a more thorough test should include other operating systems.
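The warm-up-then-average approach described in the list above can be sketched like this. It is not the script I ran, just an illustration: the function name and the sample numbers are made up.

```python
from statistics import mean


def steady_state_average(samples: list[float], warmup_runs: int = 5, measured_runs: int = 5) -> float:
    """Discard the warm-up runs, then average the runs that follow."""
    needed = warmup_runs + measured_runs
    if len(samples) < needed:
        raise ValueError(f"need at least {needed} runs, got {len(samples)}")
    return mean(samples[warmup_runs:needed])


# Hypothetical results: the first five runs warm the ARC and are thrown away.
runs = [10, 20, 30, 40, 50, 100, 100, 110, 90, 100]
print(steady_state_average(runs))  # 100
```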

The environments were setup as follows:

1 – VM under ESXi 6 using NFS storage from FreeNAS 9.3 VM via VT-d

  • FreeNAS 9.3 installed under ESXi.
  • FreeNAS is given 24GB memory.
  • HBA is passed to it via VT-d.
  • Storage shared to VMware via NFSv3, virtual storage network on VMXNET3.
  • Ubuntu guest given VMware para-virtual drivers

2 – VM under ESXi 6 using NFS storage from OmniOS VM via VT-d

  • OmniOS r151014 LTS installed under ESXi.
  • OmniOS is given 24GB memory.
  • HBA is passed to it via VT-d.
  • Storage shared to VMware via NFSv3, virtual storage network on VMXNET3.
  • Ubuntu guest given VMware para-virtual drivers

3 – VM under FreeBSD bhyve

  • bhyve running on FreeBSD 10.1-Release
  • Guest storage is file image on ZFS dataset.

4 – VM under FreeBSD bhyve sync always

  • bhyve running on FreeBSD 10.1-Release
  • Guest storage is file image on ZFS dataset.
  • Sync=always

Benchmark Results

MariaDB OLTP Load

This test is a mix of CPU and storage I/O.  bhyve (yellow) pulls ahead in the 2 threaded test, probably because it doesn’t have to issue a sync after each write.  However, it falls behind on the 4 threaded test even with that advantage, probably because it isn’t as efficient at handling CPU processing as VMware (see next chart on finding primes).
sysbench_oltp

Finding Primes

Finding prime numbers with a VM under VMware is significantly faster than under bhyve.

sysbench_primes

Random Read

bhyve has an advantage, probably because it has direct access to ZFS.

sysbench_rndrd

Random Write

With sync=standard bhyve has a clear advantage.  I’m not sure why VMware can outperform bhyve sync=always.  I am merely speculating but I wonder if VMware over NFS is translating smaller writes into larger blocks (maybe 64k or 128k) before sending them to the NFS server.

sysbench_rndwr

Random Read/Write

sysbench_rndrw

Sequential Read

Sequential reads are faster with bhyve’s direct storage access.

sysbench_seqrd

Sequential Write

What not having to sync every write will gain you..

sysbench_seqwr

Sequential Rewrite

sysbench_seqrewr

 

Summary

VMware is a very fine virtualization platform that’s been well tuned.  All that overhead of VT-d, virtual 10GbE switches for the storage network, VM storage over NFS, etc. is not hurting its performance except perhaps on sequential reads.

For as young as bhyve is I’m happy with the performance compared to VMware, though it appears to be slower on the CPU intensive tests.  I didn’t intend on comparing CPU performance so I haven’t done enough variety of tests to see what the difference is there, but it appears VMware has an advantage.

One thing that is not clear to me is how safe running sync=standard is on bhyve.  The ideal scenario would be honoring fsync requests from the guest, however I’m not sure if bhyve has that kind of insight from the guest.  Probably the worst case under this scenario with sync=standard is losing the last 5 seconds of writes–but even that risk can be mitigated with battery backup. With standard sync there’s a lot of performance to be gained over VMware with NFS.  Even if you run bhyve with sync=always it does not perform badly, and even outperforms VMware All-in-one design on some tests.
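To put that worst case in perspective, the amount of data at risk is roughly the write rate multiplied by the window. A trivial sketch, assuming the 5 second window mentioned above and a purely hypothetical write rate:

```python
def writes_at_risk_mb(write_rate_mb_per_s: float, window_seconds: float = 5.0) -> float:
    """Rough upper bound on un-synced data lost if the host dies mid-window."""
    return write_rate_mb_per_s * window_seconds


# Hypothetical guest writing a steady 100 MB/s.
print(writes_at_risk_mb(100.0))  # 500.0
```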

The upcoming FreeNAS 10 may be an interesting hypervisor + storage platform, especially if it provides a GUI to manage bhyve.

 

Supermicro X10SDV-F Build; Datacenter in a Box

I don’t have room for a couple of rackmount servers anymore so I was thinking of ways to reduce the footprint and noise from my servers.  I’ve been very happy with Supermicro hardware so here’s my Supermicro Mini-ITX Datacenter in a box build.

Supermicro Microtower

Supermicro X10SDV Motherboard

Unlike most processors, the Xeon D is an SoC (System on Chip), meaning that it’s built into the motherboard.  Depending on your compute needs, you’ve got a lot of pricing / power flexibility with the Mini-ITX Supermicro X10SDV motherboards, with the Xeon D SoC ranging from a budget build of 2 cores to a ridiculous 16 cores rivaling high end Xeon E5 class processors!

How many cores do you want?  CPU/Motherboard Options

x10sdv-4c-tln2f_spec Supermicro board with fan

A few things to keep in mind when choosing a board.  Some come with a fan (normally indicated by a + after the core count), some don’t.  I suggest getting it with a fan unless you’re putting some serious air flow (such as with a 1U server) through the heatsink.  I got one without a fan and had to do a Noctua mod (below).

Many versions of this board are rated for 7-years lifespan which means they have components designed to last longer than most boards!  Usually computers go obsolete before they die anyway, but it’s nice to have that option if you’re looking for a permanent solution.  A VMware / NAS server that’ll last you 7-years isn’t bad at all!

On the last 5 digits, you’ll see two options: “-TLN2F” and “-TLN4F”.  This refers to the number of Ethernet ports (N2 comes with 2 x gigabit ports, and N4 usually comes with 2 gigabit plus 2 x 10 gigabit ports).  10GbE ports may come in handy for storage, and having 4 ports may also be useful if you’re going to run a router VM such as pfSense.

 

I bought the first model, just known as the “X10SDV-F”, which comes with 8 cores and 2 gigabit network ports.  This board looks like it’s designed for high density computing.  It’s like cramming dual Xeon E5’s into a Mini-ITX board.  The Xeon D-1540 will well outperform the Xeon E3-1230v3 in most tests and can handle up to 128GB memory.  The board has two NICs (it also comes in a model that offers two more 10GbE ports, providing four NICs), IPMI, 6 SATA-3 ports, a PCI-E slot, and an M.2 slot.

Supermicro X10SDV-F Motherboard
Supermicro X10SDV-F

IPMI / KVM Over-IP / Out of Band Management

One of the great features of these motherboards is you will never need to plug in a keyboard, mouse, or monitor.  In addition to the 2 or 4 normal Ethernet ports, there is one port off to the side, the management port.  Unlike HP iLO, this is a free feature on the Supermicro motherboards.  The IPMI interface will get a DHCP address. You can download the Free IPMIView software from Supermicro, or use the Android app to scan your network for the IP address.  Login as ADMIN / ADMIN (be sure to change the password).

Supermicro IPMI KVM over IP

You can even reset or power off, and even if the power is off you can power on the server remotely.

 

Supermicro KVM

And of course you also get KVM over IP, which is so low level you can get into the BIOS and even load an ISO file from your workstation to boot off of over the network!

When I first saw IPMI I made sure all my new servers have it.  I hate messing around with keyboards and mice and monitors and I don’t have room for a hardware based KVM solution.  This out of band management port is the perfect answer.  And the best part is the ability to manage your server from remote.  I have used this to power on servers and load ISO files in California from Idaho.

I should note that I would not expose the IPMI port to the internet.  Make sure it’s behind a firewall and accessible only through VPN.

Cooling issue | heatsink not enough

The first boot was fine but it crashed after about 5 minutes while I was in the BIOS setup… after a few resets I couldn’t even get it to POST.  I finally realized the CPU was getting too hot.  Supermicro probably meant for this model to be in a 1U case with good air flow.  The X10SDV-TLN4F costs a little extra but it comes with a CPU fan in addition to the 10GbE network adapters, so keep that in mind if you’re trying to decide between the two boards.

Noctua to the Rescue

I couldn’t find a CPU fan designed to fit this particular socket, so I bought a 60MM Noctua NF-A6x25.

60MM Noctua FAN on X10SDV-F
60MM Noctua Fan

This is my first Noctua fan and I think it’s the nicest fan I’ve ever owned.  It came packaged with screws, rubber leg things, an extension cord, a molex power adapter, and two noise reducer cables that slow the fan down a bit.  I actually can’t even hear the fan running at normal speed.

Noctua Fan on Xeon D-1540 X10SDV-F
Noctua Fan on Xeon D-1540

There’s not really a good way to screw the fan and the heatsink into the motherboard together, but I took the four rubber things and sort of tucked them under the heatsink screws.  This is a surprisingly secure fit.  It’s not ideal but the fan is not going to go anywhere.

Supermicro CSE-721TQ-250B

This is what you would expect from Supermicro, a quality server-grade case.  It comes with a 250 watt 80 plus power supply.  Four 3.5″ hotswap bays, trays are the same as you would find on a 16 bay enterprise chassis.  Also it comes with labels numbered from 0 to 4 so you could choose to label starting at 0 (the right way) or 1.  It is designed to fit two fixed 2.5″ drives, one on the side of the HDD cage, and the other can be used on top instead of an optical drive.

The case is roomy enough to work with.  I had no trouble adding an IBM ServeRAID M1015 / LSI 9220-8i.

CS721

 

I took this shot just to note that if you could figure out a way to secure an extra drive, there is room to fit three drives, or perhaps two drives even with an optical drive.  You’d have to use a Y-splitter to power it.  I should also note that you could use the M.2 slot to add another SSD.

supermicro_x10sdv-f_sc721_opened

The case is pretty quiet, I cannot hear it at all with my other computers running in the same room so I’m not sure how much noise it makes.

This case reminds me of the HP Microserver Gen8 and is probably about the same size and quality, but I think it’s a little roomier, and with Supermicro IPMI is free.

Compared to the Silverstone DS380 the Supermicro CS721 is more compact.  The DS380 has the advantage of being able to hold more drives.  The DS380 can fit 8 3.5″ or 2.5″ drives in hotswap bays plus an additional four 2.5″ fixed in a cage.  Between the two cases I much prefer the Supermicro CS-721 even with less drive capacity.  The DS380 has vibration issues with all the drives populated, and it’s also not as easy to work with.  The CS-721 looks and feels much higher quality.

Storage Capacity

cs721_open_door

I loaded mine with two Intel DC S3700 SSDs and 4 x 6TB drives in RAID-Z (RAID-5).  With that layout the case can provide up to 18TB of storage, which is a good amount for any data hoarder wanting to get started.
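The 18TB figure is just the number of data disks times the disk size.  Here is a small sketch of that arithmetic; the function name is my own, and it ignores ZFS metadata overhead, padding and compression, so treat the result as a rough upper bound rather than what the pool will actually report.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int = 1) -> float:
    """Rough usable capacity of a single RAID-Z vdev, ignoring ZFS overhead."""
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return (disks - parity) * disk_tb


if __name__ == "__main__":
    # 4 x 6TB in RAID-Z (single parity), as in this build.
    print(raidz_usable_tb(4, 6))
    # The same four bays with 2TB drives.
    print(raidz_usable_tb(4, 2))
```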

I think the Xeon D platform offers great value with a great range of power and pricing options.  The prices on the Xeon D motherboards are reasonable considering the Motherboard and CPU are combined, if you went with a Xeon E3 or E5 platform you’d be paying about the same or more to purchase them separately.   You’ll be paying anywhere from $350 to $2500 depending on how many cores you want.

Core Count Recommendations

For a NAS only box such as FreeNAS, OmniOS+NappIt, NAS4Free, etc. or a VMware All in one with FreeNAS and one or two light guest VMs I’d go with a simple 2C CPU.

For a bhyve or VMware + ZFS all-in-one I think the 4C is a great starter board.  It will probably handle a lot more than most people need for a home server running a handful of VMs, including the ability to transcode with a Plex or Emby server.

From there you can get 6C, 8C, 12C, or 16C, as you start getting more cores the clock frequency starts to go down so you don’t want to go overboard unless you really do need to use those cores.  Also, consider that you may prefer to get two or three smaller boards to allow failover instead of one powerful server.

What Do I Run On My Server Under My Desk?

Other Thoughts

cs721_front

I’m pretty happy with the build.  I really like how much power you can get into a microserver these days.  My build has 8 cores (16 threads) and 32GB memory (can go up to 128GB!), and with 6TB drives in RAID-Z (RAID-5) I have 18TB of usable storage (more with ZFS compression).  With VMware and ZFS you could run a small datacenter from a box under your desk.

 

FreeNAS 9.10 on VMware ESXi 6.0 Guide

This is a guide to installing FreeNAS 9.10 under VMware ESXi and then using ZFS to share the storage back to VMware.  This is roughly based on Napp-It’s All-In-One design, except that it uses FreeNAS instead of OmniOS.

vmware_all_in_one_with_storage_network

Disclaimer:  I should note that FreeNAS does not officially support running virtualized in production environments.  If you run into any problems and ask for help on the FreeNAS forums, I have no doubt that Cyberjock will respond with “So, you want to lose all your data?”  So, with that disclaimer aside let’s get going:

Update: Josh Paetzel wrote a post on Virtualizing FreeNAS so this is somewhat “official” now.  I would still exercise caution.

Update 2: This guide was originally written for FreeNAS 9.3, I’ve updated it for FreeNAS 9.10.  Also, I believe Avago LSI P20 firmware bugs have been fixed and have been around long enough to be considered stable so I’ve removed my warning on using P20.  Added sections 7.1 (Resource reservations) and 16.1 (zpool layouts) and some other minor updates.

1. Get proper hardware

Example 1: Supermicro 2U Build
SuperMicro X10SL7-F (which has a built in LSI2308 HBA).
Xeon E3-1240v3
ECC Memory
6 hotswap bays with 2TB HGST HDDs (I use RAID-Z2)
4 2.5″ hotswap bays.  2 Intel DC S3700’s for SLOG / ZIL, and 2 drives for installing FreeNAS (mirrored)

Example 2: Mini-ITX Datacenter in a Box Build
X10SDV-F (built-in Xeon D-1540 8-core Broadwell)
ECC Memory
IBM 1015 / LSI 9220-8i HBA
4 hotswap bays with 2TB HGST HDDs (I use RAID-Z)
2 Intel DC S3700’s.  1 for SLOG / ZIL, and one to boot ESXi and install FreeNAS to.

Hard drives.  See info on my Hard Drives for ZFS post.

The LSI2308/M1015 has 8 ports.  I like to use two DC S3700s for a striped SLOG device and then do a RAID-Z2 of spinners on the other 6 slots.  Also get one drive (preferably two for a mirror) that you will plug into the SATA ports (not on the LSI controller) for the local ESXi data store.  I’m using DC S3700s because that’s what I have, but this doesn’t need to be fast storage, it’s just to put FreeNAS on.

2. Flash HBA to IT Firmware

As of FreeNAS 9.3.1 or greater you should be flashing to IT mode P20 (looks like it’s P21 now but it’s not available from every vendor yet).

I strongly suggest pulling all drives before flashing.

 LSI 2308 IT firmware for Supermicro

Here’s instructions to flash the firmware: http://hardforum.com/showthread.php?t=1758318

Supermicro firmware: ftp://ftp.supermicro.com/Driver/SAS/LSI/2308/Firmware/IT/

For IBM M1015 / LSI Avago 9220-8i

Instructions for flashing firmware:

https://forums.freenas.org/index.php?threads/confirmation-please-lsi-9211-i8-flashing-to-p20.40373/

LSI / Avago Firmware: http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9211-8i#downloads

If you already have the card passed through to FreeNAS via VT-d (steps 6-8) you can actually flash the card from FreeNAS with the sas2flash utility using the steps below (in this example my card is already in IT mode so I’m just upgrading it):

(Wait a few minutes; at this point FreeNAS finally crashed.  Power off FreeNAS, and then reboot VMware.)

Warning on P20 buggy firmware:

Some earlier versions of the P20 firmware were buggy, so make sure it’s version P20.00.04.00 or later.  If you can’t get P20 in version P20.00.04.00 or later then use P19 or P16.

3. Optional: Over-provision ZIL / SLOG SSDs.

If you’re going to use SSDs for SLOG you can over-provision them.  You can boot into an Ubuntu LiveCD and use hdparm, instructions are here: https://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_using_hdparm  You can also do this after VMware is installed by passing the LSI controller to an Ubuntu VM (FreeNAS doesn’t have hdparm).  I usually over-provision down to 8GB.

Update 2016-08-10: But you may want to only go to 20GB depending on your setup!  One of my colleagues discovered 8GB over-provisioning wasn’t even maxing out a 10Gb network (remember, every write to VMware is a sync so it hits the ZIL no matter what) with 2 x 10Gb fiber lagged connections between VMware and FreeNAS.  This was on an HGST 840z so I’m not sure if the same holds true for the Intel DC S3700… and it wasn’t a virtualized setup.  But I thought I’d mention it here.
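To get a feel for the sizes involved, here is a back-of-the-envelope sketch of how much data could arrive over the network at full line rate in a given time window.  The 5 second window is only an assumption for the example, the function name is mine, and a LAG will not necessarily deliver double the throughput to a single stream.  This is just arithmetic; it is not an explanation of my colleague’s result.

```python
def gigabytes_in_window(link_gbit_per_s: float, links: int, seconds: float) -> float:
    """Data (decimal GB) that could arrive at full line rate in a time window."""
    bits = link_gbit_per_s * 1e9 * links * seconds
    return bits / 8 / 1e9


if __name__ == "__main__":
    # Assumption: a 5 second window, chosen only as an example.
    print(gigabytes_in_window(10, 1, 5))  # one 10Gb link
    print(gigabytes_in_window(10, 2, 5))  # two 10Gb links, best case
```

Comparing those numbers with the size you over-provision to is one way to sanity check whether 8GB or 20GB is closer to what your network could theoretically push.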

4. Install VMware ESXi 6

Image

The free version of the hypervisor is here: http://www.vmware.com/products/vsphere-hypervisor  I usually install it to a USB drive plugged into the motherboard’s internal header.

Under configuration, storage, click add storage.  Choose one (or two) of the local storage disks plugged into your SATA ports (do not add a disk on your LSI controller).

5. Create a Virtual Storage Network.

For this example my VMware management IP is 10.2.0.231, the VMware Storage Network IP is 10.55.0.2, and the FreeNAS Storage Network IP is 10.55.1.2.

Create a virtual storage network with jumbo frames enabled.

VMware, Configuration, Add Networking. Virtual Machine…

Create a standard switch (uncheck any physical adapters).

Image [8]

Image [11]

 

Add Networking again, VMKernel, VMKernel…  Select vSwitch1 (which you just created in the previous step), give it a network different than your main network.  I use 10.55.0.0/16 for my storage so you’d put 10.55.0.2 for the IP and 255.255.0.0 for the netmask.
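If you want to double-check the addressing before typing it into VMware, Python’s standard ipaddress module can confirm that both storage IPs fall inside the /16 and that the management IP does not.  This is only a sanity-check sketch using the example addresses from this guide; the function and variable names are my own.

```python
import ipaddress

# The storage network used in this guide.
STORAGE_NET = ipaddress.ip_network("10.55.0.0/16")


def on_storage_network(ip: str) -> bool:
    """True if the address belongs to the storage network."""
    return ipaddress.ip_address(ip) in STORAGE_NET


if __name__ == "__main__":
    print("netmask:", STORAGE_NET.netmask)
    for name, ip in [
        ("VMware storage", "10.55.0.2"),
        ("FreeNAS storage", "10.55.1.2"),
        ("VMware management", "10.2.0.231"),
    ]:
        print(name, ip, on_storage_network(ip))
```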

Image [12]

Some people are having trouble with an MTU of 9000.  I suggest leaving the MTU at 1500 and make sure everything works there before testing an MTU of 9000.  Also, if you run into networking issues look at disabling TSO offloading (see comments).

Under vSwitch1 go to Properties, select vSwitch, Edit, change the MTU to 9000.  Answer yes to the no active NICs warning.

Image [14]

Image [15]

Then select the Storage Kernel port, edit, and set the MTU to 9000.

Image [17]

Image [18]
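When you later test whether jumbo frames really work end to end, the usual trick is to ping with fragmentation disabled and a payload that exactly fills the MTU.  The largest payload is the MTU minus the IPv4 and ICMP headers.  A small sketch of that calculation follows; it assumes IPv4 without IP options, and the exact ping flags differ by platform so I’m leaving them out.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, max_ping_payload(mtu))
```

If a ping of that size with "don’t fragment" set fails while a smaller one works, something in the path is still at a lower MTU.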

6. Configure the LSI 2308 for Passthrough (VT-d).

Configuration, Advanced Settings, Configure Passthrough.

Image [19]

Mark the LSI2308 controller for passthrough.

Image [20]

You must have VT-d enabled in the BIOS for this to work so if it won’t let you for some reason check your BIOS settings.

Reboot VMware.

7. Create the FreeNAS VM.

Download the FreeNAS ISO from http://www.freenas.org/download-freenas-release.html

Create a new VM, choose custom, put it on one of the drives on the SATA ports, Virtual Machine version 11, Guest OS type is FreeBSD 64-bit, 1 socket and 2 cores.  Try to give it at least 8GB of memory.  On Networking give it two adapters, the 1st NIC should be assigned to the VM Network, 2nd NIC to the Storage network.  Set both to VMXNET3.

vmware_freenas_networking

SCSI controller should be the default, LSI Logic Parallel.

Choose Edit the Virtual Machine before completion.

If you have a second local drive (not one that you’ll use for your zpool) here you can add a second boot drive for a mirror.

Before finishing the creation of the VM click Add, select PCI Devices, and choose the LSI 2308.

Image [32]

And be sure to go into the CD/DVD drive settings and set it to boot off the FreeNAS iso.  Then finish creation of the VM.

7.1 FreeNAS VM Resource allocation

Also, since FreeNAS will be driving the storage for the rest of VMware, it’s a good idea to make sure it has a higher priority for CPU and Memory than other guests.  Edit the virtual machine, under Resources set the CPU Shares to “High” to give FreeNAS a higher priority, then under Memory allocation lock the guest memory so that VMware doesn’t ever borrow from it for memory ballooning.  You don’t want VMware to swap out ZFS’s ARC (memory read cache).

freenas_vmware_cpu_resource allocation

freenas_vmware_lock_guest_memory

 

8. Install FreeNAS.

Boot the VM, install it to your SATA drive (or two of them to mirror boot).

freenas_installer_mirrored_boot

After it’s finished installing reboot.

9. Install VMware Tools.

SKIP THIS STEP.  As of FreeNAS 9.10.1 installing VMware Tools may no longer be necessary.  You can skip step 9 and go to 10.  I’m just leaving this for historical purposes.

In VMware right-click the FreeNAS VM,  Choose Guest, then Install/Upgrade VMware Tools.  You’ll then choose interactive mode.

Mount the CD-ROM and copy the VMware install files to FreeNAS:

Once installed Navigate to the WebGUI, it starts out presenting a wizard, I usually set my language and timezone then exit the rest of the wizard.

Under System, Tunables
Add a Tunable.  The Variable should be: vmxnet3_load.  The Type should be Loader and the Value YES.

Reboot FreeNAS.  On reboot you should notice that the VMXNET3 NICS now work (except the NIC on the storage network can’t find a DHCP server, but we’ll set it to static later), also you should notice that VMware is now reporting that VMware tools are installed.

vmware_tools

If all looks well shutdown FreeNAS (you can now choose Shutdown Guest from VMware to safely power it off), remove the E1000 NIC and boot it back up (note that the IP address on the web gui will be different).

10.  Update FreeNAS

Before doing anything let’s upgrade FreeNAS to the latest stable under System, Update.

This is a great time to make some tea.

Once that’s done it should reboot.  Then I always go back again and check for updates again to make sure there’s nothing left.

11. SSL Certificate on the Management Interface (optional)

On my DHCP server I’ll give FreeNAS a static/reserved IP, and setup an entry for it on my local DNS server.  So for this example I’ll have a DNS entry on my internal network for stor1.b3n.org.

If you don’t have your own internal Certificate Authority you can create one right in FreeNAS:

System, CAs, Create internal CA.  Increase the key length to 4096 and make sure the Digest Algorithm is set to SHA256.

freenas_create_internal_ca

Click on the CA you just created, hit the Export Certificate button, and click on the downloaded file to install the Root certificate on your computer.  You can either install it just for your profile or for the local machine.  I usually do local machine, and you’ll want to make sure to store it in the Trusted Root Certificate Authorities store.

certificate_store

trusted_root_store

Just a warning: you must keep this Root CA guarded.  If an attacker were to get access to it they could generate certificates to impersonate anyone (including your bank) to initiate a MITM attack.

Also Export the Private Key of the CA and store it some place safe.

Now create the certificate…

System, Certificates, Create Internal Certificate.  Once again bump the key length to 4096.  The important part here is the Common Name must match your DNS entry.  If you are going to access FreeNAS via IP then you should put the IP address in the Common Name field.

freenas_create_certificate

System, Information.  Set the hostname to your dns name.

System, General.  Change the protocol to HTTPS and select the certificate you created.  Now you should be able to use HTTPS to access the FreeNAS WebGUI.

12. Setup Email Notifications

Account, Users, Root, Change Email, set to the email address you want to receive alerts (like if a drive fails or there’s an update available).

System, Advanced

Show console messages in the footer.  Enable (I find it useful)

System Email…

Fill in your SMTP server info… and send a test email to make sure it works.

13.  Setup a Proper Swap

FreeNAS by default creates a swap partition on each drive, and then stripes the swap across them so that if any one drive fails there’s a chance your system will crash.  We don’t want this.

System, Advanced…

Swap size on each drive in GiB, affects new disks only. Setting this to 0 disables swap creation completely (STRONGLY DISCOURAGED).   Set this to 0.

Open the shell.  This will create a 4GB swap file (based on https://www.freebsd.org/doc/handbook/adding-swap-space.html)

If you are on FreeNAS 9.10

System, Tasks, Add Init/Shutdown Script, Type=Command.  Command:

When = Post Init

swap_post_init

If you are on FreeNAS 9.3

System, Tunables, Add Tunable.

Variable=swapfile, Value=/usr/swap0, Type=rc.conf

swap_tunable

Back to Both:

Next time you reboot on the left Navigation pane click Display System Processes and make sure the swap shows up.  If so it’s working.

swap

14. Configure FreeNAS Networking

Setup the Management Network (which you are currently using to connect to the WebGUI).

Network, Interfaces, Add Interface, choose the Management NIC, vmx3f0, and set to DHCP.

management_nic

Setup the Storage Network

Add Interface, choose the Storage NIC, vmx3f1, and set to 10.55.1.2 (I set up my VMware hosts on 10.55.0.x and ZFS servers on 10.55.1.x), be sure to select /16 for the netmask.  And set the MTU to 9000.

storage_network2

Open a shell and make sure you can ping the ESXi host at 10.55.0.2

Reboot.  Let’s make sure the networking and swap stick.

15. Hard Drive Identification Setup

Label Drives.   FreeNAS is great at detecting bad drives, but it’s not so great at telling you which physical drive is having an issue.  It will tell you the serial number and that’s about it.  But how confident are you in knowing which drive fails?  If FreeNAS tells you that disk da3 (by the way, all these da numbers can change randomly) is having an issue how do you know which drive to pull?  Under Storage, View Disks, you can see the serial number, this still isn’t entirely helpful because chances are you can’t see the serial number without pulling a drive.  So we need to map them to slot numbers or labels of some sort.

disks

There are two ways you can deal with this.  The first, and my preference, is sas2ircu.  Assuming you connected the cables between the LSI 2308 and the backplane in proper sequence sas2ircu will tell you the slot number the drives are plugged into on the LSI controller.  Also if you’re using a backplane with an expander that supports SES2 it should also tell you which slots the drives are in.  Try running this command:

sas2ircu

You can see that it tells you the slot number and maps it to the serial number.  If you are comfortable that you know which physical drive each slot number is in then you should be okay.

If not, the second method is to remove all the drives from the LSI controller, put in just the first drive, and label it Slot 0 in the GUI by clicking on the drive, Edit, and entering a Description.

label_drive

labeled_drive

Put in the next drive in Slot 1 and label it, then insert the next drive and label it Slot 2 and so on…

The Description will show up in FreeNAS and it will survive reboots.  It will also follow the drive even if you move it to a different slot.  So it may be more appropriate to make your description match a label on the removable trays rather than the bay number.

It doesn’t matter if you label the drives or use sas2ircu, just make sure you’re confident that you can map a serial number to a physical drive before going forward.
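Whichever method you pick, the end result is a serial number to slot mapping that you can look up when a drive fails.  If you’d rather keep that mapping outside of FreeNAS as well, something as simple as the sketch below works.  The serial numbers here are made-up placeholders and the function name is my own; fill in the real serials from Storage, View Disks.

```python
# Hypothetical serial numbers; replace them with the ones FreeNAS reports.
SLOT_BY_SERIAL = {
    "SERIAL-AAA": 0,
    "SERIAL-BBB": 1,
    "SERIAL-CCC": 2,
    "SERIAL-DDD": 3,
}


def slot_for(serial: str) -> int:
    """Return the bay/slot label recorded for a drive serial number."""
    key = serial.strip().upper()
    if key not in SLOT_BY_SERIAL:
        raise KeyError(f"serial {serial!r} is not in the slot map")
    return SLOT_BY_SERIAL[key]


if __name__ == "__main__":
    print(slot_for("serial-ccc"))
```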

16.1 Choose Pool Layout

For high performance the best configuration is to maximize the number of VDEVs by creating mirrors (essentially RAID-10).  That said, with my 6-drive RAID-Z2 array with 2 DC S3700 SSDs for SLOG/ZIL my setup performs very well with VMware in my environment.  If you’re running heavy random I/O mirrors are more important, but if you’re just running a handful of VMs RAID-Z / RAID-Z2 will probably offer great performance as long as you have a good SSD for SLOG device.   I like to start double parity at 5 or 6 disk VDEVs, and triple parity at 9 disks.  Here are some sample configurations:

Example zpool / vdev configurations

2 disks = 1 mirror
3 disks = RAID-Z
4 disks = RAID-Z or 2 mirrors
5 disks = RAID-Z, or RAID-Z2, or 2 mirrors with hot spare.
(Don’t configure 5 disks with 4 drives being in RAID-Z plus 1 hot spare–that’s just ridiculous.  Make it a RAID-Z2 to begin with).
6 disks = RAID-Z2, or 3 mirrors
7 disks = RAID-Z2, or 3 mirrors plus hot spare
8 disks = RAID-Z2, or 4 mirrors
9 disks = RAID-Z3, or 4 mirrors plus hot spare
10 disks = RAID-Z3, 2 vdevs of 5 disk RAID-Z2 or 5 mirrors
11 disks = RAID-Z3, 2 vdevs of 5 disk RAID-Z2 plus hot spare or 5 mirrors with hot spare
12 disks = 2 vdevs of 6 disk RAID-Z2, or 5 mirrors with 2 hot spares
13 disks = 2 vdevs of 6 disk RAID-Z2 plus hot spare or 6 mirrors with one hot spare
14 disks = 2 vdevs of 7 disk RAID-Z2 or 6 mirrors plus 2 hot spares
15 disks = 3 vdevs of 5 disk RAID-Z2 or 7 mirrors with 1 hot spare
16 disks = 3 vdevs of 5 disk RAID-Z2 plus hot spare or 7 mirrors with 2 hot spares
17 disks = 3 vdevs of 5 disk RAID-Z2 plus hot spares or 7 mirrors with 3 hot spares
18 disks = 2 vdevs of 9 disk RAID-Z3, 3 vdevs of 6 disk RAID-Z2 or 8 mirrors with 2 hot spares
19 disks = 2 vdevs of 9 disk RAID-Z3, 3 vdevs of 6 disk RAID-Z2 plus hot spares or 8 mirrors with 3 hot spares
20 disks = 2 vdevs of 10 disk RAID-Z3, 4 vdevs of 5 disk RAID-Z2 plus hot spares or 9 mirrors with 2 hot spares

Anyway, that gives you a rough idea.  The more vdevs the better random performance.  It’s always a balance between capacity, performance, and safety.
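If you want to compare a few candidate layouts for your own disk count, the sketch below adds up how many disks a layout consumes and how many disks’ worth of capacity it leaves for data.  The class and field names are my own invention, and it deliberately ignores ZFS overhead, so use it for comparing layouts rather than predicting exact usable space.

```python
from dataclasses import dataclass

PARITY = {"mirror": 1, "raidz": 1, "raidz2": 2, "raidz3": 3}


@dataclass
class Layout:
    vdev_type: str       # "mirror", "raidz", "raidz2" or "raidz3"
    vdevs: int           # number of vdevs in the pool
    disks_per_vdev: int  # disks in each vdev (2 for a plain mirror)
    hot_spares: int = 0

    def __post_init__(self) -> None:
        if self.vdev_type not in PARITY:
            raise ValueError(f"unknown vdev type: {self.vdev_type}")

    def total_disks(self) -> int:
        return self.vdevs * self.disks_per_vdev + self.hot_spares

    def data_disks(self) -> int:
        if self.vdev_type == "mirror":
            return self.vdevs  # one disk's worth of capacity per mirror
        return self.vdevs * (self.disks_per_vdev - PARITY[self.vdev_type])


if __name__ == "__main__":
    # The two 12-disk options from the list above.
    for layout in (Layout("raidz2", 2, 6), Layout("mirror", 5, 2, hot_spares=2)):
        print(layout, layout.total_disks(), layout.data_disks())
```

For the 12-disk case this shows the trade-off directly: two 6-disk RAID-Z2 vdevs give more capacity, while five mirrors give more vdevs (and so better random performance) plus spares.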

16.2  Create the Pool.

Storage, Volumes, Volume Manager.

Click the + next to your HDDs and add them to the pool as RAID-Z2.

Click the + next to the SSDs and add them to the pool.  By default the SSDs will be on one row and two columns.  This will create a mirror.  If you want a stripe just add one Log device now and add the second one later.  Make certain that you change the dropdown on the SSD to “Log (ZIL)”  …it seems to lose this setting anytime you make any other changes so change that setting last.  If you do not do this you will stripe the SSD with the HDDs and possibly create a situation where any one drive failure can result in data loss.

create_pool_2

Back to Volume manager and add the second Log device…

add_zil

I have on numerous occasions had the Log get changed to Stripe after I set it to Log, so just double-check by clicking on the top level tank, then the volume status icon and make sure it looks like this:

good_tank

17.  Create an NFS Share for VMware

You can create either an NFS share, or iSCSI share (or both) for VMware.  First here’s how to setup an NFS share:

Storage, Volumes, Select the nested Tank, Create Data Set

Be sure to disable atime.

freenas_vmware_nfs

Sharing, NFS, Add Unix (NFS) Share.   Add the vmware_nfs dataset, and grant access to the storage network, and map the root user to root.

add_nfs_share

Answer yes to enable the NFS service.

In VMware, Configuration, Add Storage, Network File System and add the storage:

mount_nfs

And there’s your storage!

nfs_mounted

18.  Create an iSCSI share for VMware

WARNING: Note that at this time, based on some of the comments below with people having connection drop issues on iSCSI I suggest testing with heavy concurrent loads to make sure it’s stable.  Watch dmesg and /var/log/messages on FreeNAS for iSCSI timeouts.  Personally I use NFS.  But here’s how to enable iSCSI:

Storage, select the nested tank, Create Zvol.  Be sure compression is set to lz4.  Check Sparse Volume.  Choose advanced mode and optionally change the default block size.  I use a 64K block size based on some benchmarks I’ve done comparing 16K (the default), 64K, and 128K.  64K blocks didn’t really hurt random I/O but helped some on sequential performance, and also gave a better compression ratio.  128K blocks had the best compression ratio but random I/O started to suffer, so I think 64K is a good middle ground.  Various workloads will probably benefit from different block sizes.
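One simplified way to picture why bigger blocks can hurt small random I/O: a write smaller than the block may force the whole block to be read and rewritten.  The sketch below computes that worst-case ratio.  It ignores caching, compression and write aggregation, the 4K guest I/O size is an assumption, and the function name is my own, so treat it as intuition and not as a prediction of benchmark results.

```python
def worst_case_amplification(block_kib: int, io_kib: int) -> float:
    """Block size divided by I/O size when the I/O is smaller than the block."""
    if block_kib <= 0 or io_kib <= 0:
        raise ValueError("sizes must be positive")
    if io_kib >= block_kib:
        return 1.0
    return block_kib / io_kib


if __name__ == "__main__":
    # Assumption: the guest issues 4K random writes.
    for block in (16, 64, 128):
        print(f"{block}K block:", worst_case_amplification(block, 4))
```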

vmware_block

Sharing, Block (iSCSI), Target Global Configuration.

Set the base name to something sensible like: iqn.2011-03.org.b3n.stor1.istgt  Set Pool Available Space Threshold to 60%

iscsi _target_global

Portals tab… add a portal on the storage network.

iscsi_portal

Initiator.  Add Initiator.

add_initiator

Targets.  Add Target.

add_target_3

Extents.  Add Extent.

vmware_block_extent

Associated Targets.  Add Target / Extent.

add_target_extent

Under Services enable iSCSI.

In VMware, Configuration, Storage Adapters, Add Adapter, iSCSI.

Select the iSCSI Software Adapter in the adapters list and choose properties.  Dynamic discovery tab.  Add…

add_iscsi_server

Close and re-scan the HBA / Adapter.

You should see your iSCSI block device appear…

iscsi_working

Configuration, Storage, Add Storage, Disk/LUN, select the FreeBSD iSCSI Disk.

block_device_in_vmware

19.  Setup ZFS VMware-Snapshot coordination.

This will coordinate with VMware to take clean snapshots of the VMs whenever ZFS takes a snapshot of that dataset.

Storage.  Vmware-Snapshot.  Add VMware-Snapshot.  Map your ZFS dataset to the VMware data store.

ZFS / VMware snapshots of NFS example.

vmware_snapshot_coordination_nfs

ZFS / VMware snapshots of iSCSI example.

vmware_snapshot_coordination_block

20. Periodic Snapshots

Add periodic snapshot jobs for your VMware storage under Storage, Periodic Snapshot Tasks.  You can setup different snapshot jobs with different retention policies.

snap_tasks
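When picking intervals and retention, it helps to know roughly how many snapshots each job will keep around.  Here is a quick sketch of that arithmetic; the schedules in the example are hypothetical and not necessarily the ones I use, and the function name is my own.

```python
from datetime import timedelta


def snapshots_retained(interval: timedelta, keep_for: timedelta) -> int:
    """Approximate number of snapshots a periodic job keeps at steady state."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    return keep_for // interval


if __name__ == "__main__":
    # Hypothetical jobs: frequent short-lived snapshots plus daily long-lived ones.
    print(snapshots_retained(timedelta(minutes=15), timedelta(days=2)))
    print(snapshots_retained(timedelta(hours=1), timedelta(weeks=2)))
    print(snapshots_retained(timedelta(days=1), timedelta(days=90)))
```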

21. ZFS Replication

If you have a second FreeNAS Server (say stor2.b3n.org) you can replicate the snapshots over to it.  On stor1.b3n.org, go to Replication Tasks, View Public Key, and copy the key to the clipboard.

On the server you’re replicating to, stor2.b3n.org, go to Account, View Users, root, Modify User, and paste the public key into the SSH public Key field.  Also create a dataset called “replicated”.

Back on stor1.b3n.org:

Add Replication.  Do an SSH keyscan.

add_replication2

And repeat for any other datasets.  Optionally you could also just replicate the entire pool with the recursive option.

22.  Automatic Shutdown on UPS Battery Failure (Work in Progress)

The goal is, on power loss, to shut down all the VMware guests including FreeNAS before the battery fails.  So far all I have gotten is the APC working with VMware.  Edit the VM settings and add a USB controller, then add a USB device and select the UPS, in my case an APC Back-UPS ES 550G.  Power FreeNAS back on.

On the shell type:

dmesg|grep APC

This will tell you where the APC device is.  In my case it’s showing up on ugen0.4.  I ended up having to grant world access to the UPS…

For some reason I could not get the GUI to connect to the UPS.  I can select ugen0.4, but under the drivers dropdown I just have hyphens —— … so I set it manually in /usr/local/etc/nut/ups.conf

However, this file gets overwritten on reboot, and also the rc.conf setting doesn’t seem to stick.  I added this tunable to get the rc.conf setting…

nut_enable

And I created my ups.conf file in /mnt/tank/ups.conf.  Then I created a script to stop the nut service, copy my config file and restart the nut service in /mnt/tank/nutscript.sh

Then under tasks, Init/Shutdown Scripts I added a task to run the script post init.

post_init

Next step is to configure automatic shutdown of the VMware server and all guests on it…  I have not done this yet.

There’s a couple of approaches to take here.  One is to install a NUT client on the ESXi, and the other is to have FreeNAS ssh into VMware and tell it to shutdown.  I may update this section later if I ever get around to implementing it.

23. Backups

Before going live make sure you have adequate backups!  You can use ZFS replication with a fast link.  For slow network connections Rsync will work better (look under Tasks -> Rsync Tasks) or use a cloud service like CrashPlan.   Here’s a nice CrashPlan on FreeNAS Howto.

BACKUPS BEFORE PRODUCTION.  I can’t stress this enough, don’t rely on ZFS’s redundancy alone, always have backups (one offsite, one onsite) in place before putting anything important on it.

 Setup Complete… mostly.

Well, that’s all for now.