Good morning. Today is March 31st. 🌐 It’s World Backup Day. This is a good day to review your backup strategy. I thought I’d share a cloud backup experiment I’ve been running for 9 months. I hope you find it helpful.
I needed a cloud service for my offsite backups… the two prominent services are BackBlaze B2 and Amazon S3. I wasn’t sure which would be cheaper. But there’s one way to find out.
For the last 9 months I’ve been backing up my TrueNAS data to both BackBlaze B2 and AWS S3.
About the NAS Data
My TrueNAS Data that I’m backing up to the cloud consists of the following:
- 1.39 TB Archive data (mostly a collection of data archived in lots of .7z compressed files organized by year with the last couple of years uncompressed)
- 1.65 TB SMB share (my Automatic Ripping Machine target is here, and lots of small documents and files as well).
- 0.46 TB NFS share to Proxmox Backup Server. This data is lots of small backup files that can reproduce VMs and Container block storage and probably changes a lot.
BackBlaze B2 Target
BackBlaze is a simple $5/TB/month with no ingress fees. Here’s the B2 Pricing page. Small egress fees to restore. Pretty straightforward.
There are also transaction fees (based on the number of API calls), but these are pretty minimal if using the rclone --fast-list option, so you can almost ignore them.
Amazon AWS S3 Target
Amazon S3 is not that simple… here’s the S3 pricing page. You’ve got different storage tiers:
- S3 Standard $23/TB/month
- S3 Standard – Infrequent Access – $12/TB/month
- S3 Glacier Instant Retrieval $4/TB/month
- S3 Glacier Flexible Retrieval $3.60/TB/month
- S3 Glacier Deep Archive $0.99/TB/month
- S3 Intelligent – Tiering which automatically moves your data into the best tier based on usage patterns.
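To put those per-TB prices in perspective, here’s a rough sketch that multiplies them by the size of my three datasets. It only counts storage at the prices quoted above; it ignores request fees, egress, minimum storage durations, and versioning overhead, so treat the numbers as a lower bound rather than a real bill.

```python
# Storage-only monthly cost for the three datasets at the per-TB prices
# quoted in this post. Request, retrieval, and egress fees are NOT included.
DATASETS_TB = {"archive": 1.39, "smb_share": 1.65, "pbs_nfs_share": 0.46}

PRICE_PER_TB_MONTH = {
    "B2": 5.00,
    "S3 Standard": 23.00,
    "S3 Standard-IA": 12.00,
    "S3 Glacier Instant Retrieval": 4.00,
    "S3 Glacier Flexible Retrieval": 3.60,
    "S3 Glacier Deep Archive": 0.99,
}


def monthly_storage_cost(total_tb, price_per_tb):
    """Return the storage-only cost in dollars per month."""
    return round(total_tb * price_per_tb, 2)


total_tb = round(sum(DATASETS_TB.values()), 2)
costs = {
    tier: monthly_storage_cost(total_tb, price)
    for tier, price in PRICE_PER_TB_MONTH.items()
}

print(f"Total data: {total_tb} TB")
for tier, cost in costs.items():
    print(f"{tier:32s} ${cost:6.2f}/month")
```

On paper, every Glacier tier beats B2 for 3.5 TB. The rest of this post is about why the real bill didn’t look like that.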
Egress from S3 is atrocious; in addition to retrieval fees, transferring your data to a location outside Amazon costs $90/TB. However, the likelihood of losing two local copies is slim. And if I had a catastrophic event that took out my laptop and local backups, $90/TB would be small potatoes in the grand scheme of things.
S3 Glacier Deep Archive Tier
The Glacier Deep Archive tier, at $0.99/TB/month, is cheaper than BackBlaze’s $5, but there is a minimum storage duration of 180 days (change or delete a file early and you’ll still pay for the full 180 days) and a retrieval delay of up to 12 hours. Plus, the API fees can cost a lot (especially for lots of small files).
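Here’s a minimal sketch of what that 180-day minimum means in dollars. It assumes the early-deletion charge is simply the storage cost for the days remaining in the 180-day window and uses a flat 30-day month; both are simplifications of mine, not AWS’s exact billing formula.

```python
# Rough estimate of the Deep Archive early-deletion charge.
# Assumption: the charge equals storage cost for the remaining days of the
# 180-day minimum, with a simplified 30-day month.
DEEP_ARCHIVE_PRICE_PER_TB_MONTH = 0.99
MIN_STORAGE_DAYS = 180
DAYS_PER_MONTH = 30  # simplification


def early_deletion_charge(size_tb, days_stored):
    """Dollars owed for deleting or overwriting data before 180 days."""
    remaining_days = max(0, MIN_STORAGE_DAYS - days_stored)
    charge = size_tb * DEEP_ARCHIVE_PRICE_PER_TB_MONTH * remaining_days / DAYS_PER_MONTH
    return round(charge, 2)


# Example: replacing the whole 1.39 TB archive dataset after 30 days.
print(early_deletion_charge(1.39, 30))
# Data older than 180 days can be deleted without this charge.
print(early_deletion_charge(1.39, 200))
```

The storage penalty itself is small at $0.99/TB. The request fees described next are what really hurt.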
At first, I thought I’d back up my archive data (which rarely changes) to Glacier Deep Archive. I got a huge bill: the Deep Archive tier charges $0.05 per 1,000 requests (put, copy, post, list). When you have a lot of files, this gets expensive fast, so I shut it down. I do keep most older years compressed, but I like to keep the last couple of years uncompressed, and that can end up costing hundreds, if not thousands, of dollars per month in list requests.
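A quick sketch of how the $0.05 per 1,000 requests adds up. The request counts below are hypothetical (I’m not reproducing my actual bill), and the four syncs per month just reflects a weekly schedule.

```python
# Estimate monthly request fees at the Deep Archive rate quoted above.
# The request counts used in the examples are hypothetical.
REQUEST_PRICE_PER_1000 = 0.05


def monthly_request_cost(requests_per_sync, syncs_per_month=4):
    """Dollars per month spent on put/copy/post/list requests."""
    total_requests = requests_per_sync * syncs_per_month
    return round(total_requests / 1000 * REQUEST_PRICE_PER_1000, 2)


for requests in (100_000, 1_000_000, 5_000_000):  # hypothetical volumes
    cost = monthly_request_cost(requests)
    print(f"{requests:>9,} requests/sync -> ${cost:,.2f}/month")
```

Compare that with the storage-only price of $0.99/TB/month and it’s clear that for a tree of many small files, requests dominate the bill and storage barely registers.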
But you can use Intelligent Tiering, which has more reasonable request charges and still lets you lifecycle files into the Glacier Deep Archive tier after 180 days, bypassing the Deep Archive list charges.
Intelligent Tiering automates storage tiers and smooths out your costs on S3 a bit. I was curious how it compared to B2. My theory was that BackBlaze would be cheaper initially, but AWS would become less expensive within 6 months as things were moved into S3 Deep Archive. For my data set, it turns out the ROI for using AWS S3 is so far out it’s probably not worth it.
Storage Policy Configuration
- For AWS I set up a policy on the Intelligent Tiering Configuration to move files to Deep Archive after 180 days.
- For both AWS and BackBlaze, I set a 30-day versioning rule and expired non-current versions after 30 days. This gives me 30 days worth of immutable backups to help protect against corrupted data due to a bug, mistake, or compromise of the TrueNAS system.
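For anyone who prefers to script the S3 side rather than click through the console, here’s a sketch of what those two policies might look like as boto3 request payloads. I’m assuming boto3 here; the bucket name and rule IDs are made up, the bucket needs versioning enabled, and objects have to be uploaded with the INTELLIGENT_TIERING storage class for the first policy to apply. B2 has its own lifecycle settings that aren’t shown.

```python
import json

BUCKET = "example-truenas-backup"  # hypothetical bucket name

# Intelligent-Tiering: move objects not accessed for 180 days to Deep Archive Access.
intelligent_tiering_config = {
    "Id": "deep-archive-after-180-days",  # hypothetical ID
    "Status": "Enabled",
    "Tierings": [{"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"}],
}

# Lifecycle: expire non-current object versions 30 days after they are replaced.
lifecycle_config = {
    "Rules": [
        {
            "ID": "expire-noncurrent-after-30-days",  # hypothetical ID
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}


def apply_policies(s3_client, bucket):
    """Apply both policies; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_bucket_intelligent_tiering_configuration(
        Bucket=bucket,
        Id=intelligent_tiering_config["Id"],
        IntelligentTieringConfiguration=intelligent_tiering_config,
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle_config,
    )


# Print the payloads for review; nothing is sent to AWS here.
print(json.dumps(intelligent_tiering_config, indent=2))
print(json.dumps(lifecycle_config, indent=2))
```

Calling `apply_policies(boto3.client("s3"), BUCKET)` would push both to AWS, but review the printed payloads first, since a wrong expiration rule deletes data.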
In June of 2022, I set up TrueNAS’s Cloud Sync Task (which uses rclone) to back up to both S3 and B2 once a week on staggered days.
| Month | AWS S3 | BackBlaze B2 | AWS Running | BackBlaze Running |
9 Month Running Cost: $210.39 (AWS) vs $105.06 (B2).
Even at 9 months, S3 Intelligent Tiering is more expensive than B2’s monthly costs, and the running cost for S3 is double that of B2. Now, at some point, perhaps after 5 or 10 years, there may be an ROI using S3 instead of BackBlaze B2, but perhaps not.
The monthly cost for S3 hasn’t even dropped below B2 yet, and it may never.
If there are savings with S3, they will be negated if I have to restore even once. And if I do have to restore, it will take hours to unfreeze files from Glacier Deep Archive. Also, if I ever decide to re-organize my file structure, it’s going to get expensive, resetting back to S3 Standard pricing for six months.
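To get a feel for how far out the break-even point is, here’s a back-of-the-envelope sketch. The running totals and the $90/TB egress price come from this post; the two future monthly costs are hypothetical, chosen only to show what happens if S3 eventually does drop below B2.

```python
import math

# From this post: 9-month running totals and the S3 egress price.
S3_RUNNING_TOTAL = 210.39
B2_RUNNING_TOTAL = 105.06
EGRESS_PER_TB = 90.00
DATA_TB = 3.50  # 1.39 + 1.65 + 0.46

# Hypothetical future monthly bills, for illustration only.
HYPOTHETICAL_B2_MONTHLY = 17.50
HYPOTHETICAL_S3_MONTHLY = 12.00


def months_to_break_even(gap, b2_monthly, s3_monthly, restore_cost=0.0):
    """Months until S3 savings repay the gap, or None if S3 never gets cheaper."""
    monthly_savings = b2_monthly - s3_monthly
    if monthly_savings <= 0:
        return None
    return math.ceil((gap + restore_cost) / monthly_savings)


gap = round(S3_RUNNING_TOTAL - B2_RUNNING_TOTAL, 2)
full_restore = DATA_TB * EGRESS_PER_TB  # egress only, no retrieval fees

no_restore = months_to_break_even(gap, HYPOTHETICAL_B2_MONTHLY, HYPOTHETICAL_S3_MONTHLY)
one_restore = months_to_break_even(
    gap, HYPOTHETICAL_B2_MONTHLY, HYPOTHETICAL_S3_MONTHLY, full_restore
)
print(f"Gap so far: ${gap}")
print(f"Break-even without a restore: {no_restore} more months")
print(f"Break-even with one full restore: {one_restore} more months")
```

Even with those optimistic made-up numbers, a single full restore pushes the payback out by years, and that’s before counting Glacier retrieval fees.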
Now, enough of my data probably changes to keep a good portion in the S3 Standard Tier. I could try to change how my data is stored and optimize it for lower storage costs in S3. But optimizing takes time. If I started considering the value of time to optimize for S3, it just wouldn’t be worth it. The simple option of using B2 is the best choice in my situation.
A few other things you may want to consider.
- AWS replicates S3 data to three availability zones within a region. I assume this means three separate buildings somewhat separated. BackBlaze stores “multiple copies” of your data, but as far as I could tell, there’s no promise it’s spread out across multiple zones. I think S3 is more resilient to disaster than B2.
- BackBlaze seems to be less woke. I haven’t seen any posts or social media promoting leftist agenda as I see all over the place from Amazon. I appreciate a company with the discipline to be focused on their mission.
- Potential egress costs. AWS egress fees are excessive and designed for vendor lock-in. BackBlaze has reasonable egress fees.
- BackBlaze is simple. If something were to happen to me, just about anyone could figure out how to restore data from B2 using TrueNAS. AWS is also a very standard setup, but not everyone will know how to restore files that have been lifecycled into deep archive.
- BackBlaze requires less commitment. If you change your mind or want to re-organize your files, it’s more forgiving than AWS.
- AWS is not entirely honest with its uptime status. I experienced an AWS outage for a while, and nothing on the AWS status page indicated they were down, but there were plenty of comments on Reddit and HN.
I think your mileage may vary. The cost is dependent on the type of data being stored and how often it changes. In my case B2 seems to be the simple and better option.
Oh, and Happy World Backup Day!
Proverbs 22:3 LSB –
A prudent man sees evil and hides, But the simple pass on, and are punished.