How does ZFS save my bacon and let me sleep comfortably? (Part 3)

Using ZFS to build a powerful virtual machine (VM) host: Bianor’s QA engineer Deyan Kostov shares his experience and the benefits it brings.


You thought the story was over? Far from it. Turns out, ZFS was useful in the office too! In case you’ve forgotten, the office is the place where we used to work a long time ago. 🙂 Ah, the good old times… The office also houses a few servers. Our team’s virtualization hosts were getting smallish and slowish for the kinds of workloads we anticipated. So, the team and management decided to buy and provision a new one, in a cost-effective way.


The challenge

Our partners in the USA are working with a big potential customer who wants to manage more than a thousand data centers with millions of items. This requires a huge VM host with lots of CPU cores, RAM, and fast storage to handle the load. So, I dug into my bag of tricks and matched some new components with some second-hand components, and even some third-hand ones – namely the ginormous Fractal Design Define R4 case.

Here is the rest of the gear:

  • CPU: AMD Ryzen Threadripper 2950X 16c/32t
  • Motherboard: MSI X399 Gaming Pro Carbon AC
  • CPU Cooler: Noctua NH-U14S TR4-SP3
  • RAM: 128GB DDR4
  • 3 SATA SSDs, Intel enterprise 1.6TB each
  • 2 SATA HDDs, 2TB each
  • 1 SATA SSD for the OS
  • 1 SATA SSD for a fast cache of the HDD disk pool
  • A cheap-o graphics card for basic video output

Overall, top of the line two years ago, and barely “broken-in” for these types of components. 🙂 Linux drivers and optimizations tend to lag a little, so it’s a good idea to stay off the bleeding edge of hardware. A quick Linux and VM Manager installation, some bridging shenanigans on the two network interfaces (one for the internal network and one for the bridge to our partners in the USA), and presto: we had a new KVM virtualization host.

The result

In the end, the hardware was moved to the server room and attached to the network. Job done! And not a moment too soon, as that big potential customer wanted some estimates on a large database, which we delivered within a few days. If such a giant VM had to be created in a cloud like AWS or Azure and run for 5 days while scripts were creating a million items in the database, the cloud provider would have charged our USA partners about $1000. Not to mention that afterwards we pumped another 1.5 million items into the database, in case the customer wanted to test an even bigger setup.

So, how is ZFS helping?

First, the 3 SSDs in RAID-Z provide fast, redundant storage for the VMs that we use every day. The app stayed responsive because of that fast storage; with a million items in the database, HDDs were out of the question. The 2 HDDs in the host form a mirror, used for older versions of the product and less important storage, and that pool is helped by an SSD cache device (L2ARC in ZFS terms). So, the most frequently used and the most recently used VMs are fast. ZFS's cache keeps one list for the most frequently used blocks and a separate list for the most recently used ones. It’s a genius feature of ZFS, satisfying the best of both needs. Also, the latest version, OpenZFS 2.0, has made the SSD cache persistent across reboots. It does not need to “learn” what is frequently and recently used after a restart.
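To make the layout above concrete, here is a minimal Python sketch that builds (but does not run) the kind of `zpool create` commands that could produce these two pools, and estimates their usable capacity. The pool names and device paths are hypothetical, not the ones used on the real host, and the capacity figures ignore metadata and padding overhead.

```python
import shlex


def zpool_create(pool, vdev_type, devices, cache=None):
    """Build a `zpool create` argument list: one vdev plus optional cache devices."""
    cmd = ["zpool", "create", pool, vdev_type, *devices]
    if cache:
        # "cache" introduces the SSD read-cache devices for the pool
        cmd += ["cache", *cache]
    return cmd


def usable_tb(vdev_type, count, size_tb):
    """Rough usable capacity in TB, ignoring filesystem overhead."""
    if vdev_type == "raidz":
        # single parity: one disk's worth of space holds parity
        return (count - 1) * size_tb
    if vdev_type == "mirror":
        # every disk holds a full copy of the data
        return size_tb
    raise ValueError(f"unsupported vdev type: {vdev_type}")


# Hypothetical names: "fastpool"/"slowpool" and /dev/sdX are placeholders.
fast = zpool_create("fastpool", "raidz", ["/dev/sda", "/dev/sdb", "/dev/sdc"])
slow = zpool_create("slowpool", "mirror", ["/dev/sdd", "/dev/sde"], cache=["/dev/sdf"])

print(shlex.join(fast), "->", usable_tb("raidz", 3, 1.6), "TB usable")
print(shlex.join(slow), "->", usable_tb("mirror", 2, 2.0), "TB usable")
```

The commands are only printed, so the sketch is safe to run on any machine. On a real host, stable `/dev/disk/by-id/` paths are usually a better choice than `/dev/sdX` names.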

Empty file concept

Interestingly (for nerds anyway), ZFS supports the concept of an empty (sparse) file, which takes no space initially but fills up over time. And there is transparent compression, so big files filled with zeroes compress very well. When we create the virtual machines with their QCOW2 files for disks, they can be either empty files (thin), or big files filled with zeroes (thick). In both cases, the space on the disk is not occupied until needed. Currently, our VMs take more than 1.5TB of virtual space while occupying only about 300GB of physical disk space.
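The "thin" behaviour is easy to observe from user space. The sketch below is not specific to ZFS: it assumes a POSIX system whose filesystem supports sparse files, creates a 1 GiB empty file in a temporary location, and compares its apparent size with the space actually allocated. It also works out the ratio implied by the figures above.

```python
import os
import tempfile


def sparse_file_sizes(size_bytes):
    """Create a temporary sparse file; return (apparent_size, allocated_bytes)."""
    with tempfile.NamedTemporaryFile() as f:
        f.truncate(size_bytes)  # extend the file without writing any data
        f.flush()
        st = os.fstat(f.fileno())
        # st_blocks counts 512-byte units actually allocated (POSIX only)
        return st.st_size, st.st_blocks * 512


apparent, allocated = sparse_file_sizes(1 << 30)
print(f"apparent size: {apparent} bytes, allocated on disk: {allocated} bytes")

# Figures from the text: more than 1.5TB virtual, about 300GB physical.
virtual_gb, physical_gb = 1500, 300
ratio = virtual_gb / physical_gb
print(f"virtual space is at least {ratio:.0f}x the physical space used")
```

On a filesystem with sparse-file support, the allocated figure should be a tiny fraction of the apparent size. The exact number depends on the filesystem, so none is quoted here.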

Room for improvement

What’s next? Is there room for improvement? Of course, there always is! Those SATA SSDs can be replaced by NVMe SSDs to increase the disk I/O speed by roughly a factor of six. I’ll be collecting them over the next few weeks from wherever I can find them cheap. Bulgarian merchants add such a high markup that buying from abroad, even after paying shipping and tax, is actually cheaper. 🙂 Black Friday and Christmas sales, here I come! The freed space in the case can take a couple more HDDs that I have lying around. Also, some automation scripts should be moved from an older machine to the new one, to improve performance.
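Where might a factor of about six come from? One plausible reading is the raw interface ceiling. The sketch below compares SATA III with a PCIe 3.0 x4 link, on the assumption that the NVMe drives would run at PCIe 3.0 x4. These are theoretical line rates after encoding overhead, not measured drive throughput.

```python
def sata3_mb_s():
    """SATA III: 6 Gbit/s line rate with 8b/10b encoding, in MB/s."""
    return 6e9 * 8 / 10 / 8 / 1e6  # 600 MB/s


def pcie3_mb_s(lanes=4):
    """PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, in MB/s."""
    return 8e9 * 128 / 130 / 8 / 1e6 * lanes


pcie3_x4 = pcie3_mb_s(4)  # about 3938 MB/s
speedup = pcie3_x4 / sata3_mb_s()
print(f"SATA III: {sata3_mb_s():.0f} MB/s, PCIe 3.0 x4: {pcie3_x4:.0f} MB/s")
print(f"interface ceiling ratio: {speedup:.1f}x")
```

Real-world gains depend on the drives and the workload, so treat the ratio as an upper bound on sequential throughput rather than a promise.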

I… could… install a crypto miner for Monero… A joke, boss, just a joke! 🙂 It would be pointless, anyway. The electricity would cost maybe 100 times more than the Monero that can be mined on a 2-year-old CPU is worth. The miners say that even the newest Zen 3 Ryzen 5000-series chips can’t keep up. When AMD releases the Zen 3 EPYC line of server CPUs in 2021… maybe.

Transition to TrueNAS SCALE

I suppose the next part of this blog will be about the transition to TrueNAS SCALE. By late 2021 it should have its first stable release. That will be right about the time when the product will need to move to Linux containers. The application will have to transition from a single VM to clustered containers as the need to process large databases increases. Big iron solutions are reaching their limits. TrueNAS SCALE is based on Linux, ZFS, KVM, and containers, so it has all the technologies that we will need to test the old and the new product lines. ZFS will make it easy to transition the VMs on the same hardware, I’m sure of it. But as testers, we will verify and report!