ZFS is not always the answer. Bring back gmirror!

The ZFS bandwaggon has momentum, but ZFS isn’t for everyone. UFS2 has a number of killer advantages in some applications.

ZFS is great if you want to store a very large number of normal files safely. It’s copy-on-write (COW) is a major advantage for backup, archiving and general data safety, and datasets allow you to fine-tune almost any way you can think of. However, in a few circumstances, UFS2 is better. In particular, large random-access files do badly with COW.

Unlike traditional systems, a block in a file isn’t overwritten in place, it always ends up at a different location. If a file started off contiguous it’ll pretty soon be fragmented to hell and performance will go off a cliff. Obvious victims will be databases and VM hard disk images. You can tune for these, but to get acceptable performance you need to throw money and resources to bring ZFS up to the same level. Basically you need huge RAM caches, possibly an SLOG, and never let your pool get more than 50% full. If you’re unlucky enough to end up at 80% full ZFS turns off speed optimisations to devote more RAM to caching as things are going to get very bad fragmentation-wise.

If these costs are a problem, stuck with UFS. And for redundancy, there is still good old GEOM Mirror (gmirror). Unfortunately the documentation of this now-poor relation has lagged a bit, and what once worked as standard, doesn’t. So here are some tips.

The most common use of gmirror (with me anyway) is a twin-drive host. Basically I don’t want things to fail when a hard disk dies, so I add a second redundant drive. Such hosts (often 1U servers) don’t have space for more than two drives anyway – and it pays to keep things simple.

Setting up a gmirror is really simple. You create one using the “gmirror label” command. There is no “gmirror create” command; it really is called “label”, and it writes the necessary metadata label so that mirror will recognise it (“gmirror destroy” is present and does exactly what you might expect).

So something like:

gmirror label gm0 ada1 ada2

will create a device called /dev/mirror/gm0 and it’ll contain ada1’s contents mirrored on to ada2 (once it’s copied it all in the background). Just use /dev/mirror/gm0 as any other GEOM (i.e. disk). Instead of calling it gm0 I could have called it gm1, system, data, flubnutz or anything else that made sense, but gm0 is a handy reminder that it’s the first geom mirror on the system and it’s shorter to type.

The eagle eyed might have noticed I used ada1 and ada2 above. You’ve booted off ada0, right? So what happens if you try mirroring yourself with “gmirror label gm0 ada0 ada1“? Well this used to work, but in my experience it doesn’t any more. And on a twin-drive system, this is exactly what you want to do. But it is still possible, read on…

How to set up a twin-drive host booting from a geom mirror

First off, before you do anything (even installing FreeBSD) you need to set up your disks. Since the IBM XT, hard disks have been partitioned using an MBR (Master Boot Record) at the start. This is really old, naff, clunky and Microsoft. Those in the know have been using the far superior GPT system for ages, and it’s pretty cross-platform now. However, it doesn’t play nice with gmirror, so we’re going to use MBR instead. Trust me on this.

For the curious, know that GPT keeps a copy of the partition table at the beginning and end of the disk, but MBR only has one, stored at the front. gmirror keeps its metadata at the end of the disk, well away from the MBR but unfortunately in exactly the same spot as the spare GPT. You can hack the gmirror code so it doesn’t do this, or frig around with mirroring geoms rather than whole disks and somehow get it to boot, but my advice is to stick to MBR partitioning or BSDlabels, which is an extension. There’s not a lot of point in ever mounting your BSD boot drive on a non-BSD system, so you’re not losing much whatever you choose.

Speaking of metadata, both GPT and gmirror can get confused if they find any old tables or labels on a “new” disk. GPT will find old backup partition tables and try to restore them for you, and gmirror will recognise old drives as containing precious data and dig its heels in when you try to overwrite it. Both gpart and gmirror have commands to erase their metadata, but I prefer to use dd to overwrite the whole disk with zeros anyway before re-use. This checks that the disk is actually good, which is nice to know up-front. You could just erase the start and end if you were in a hurry and wanted to calculate the offsets.

Please generate and paste your ad code here. If left empty, the ad location will be highlighted on your blog pages with a reminder to enter your code. Mid-Post

The next thing you’ll need to do is load the geom_mirror kernel module. Either recompile the kernel with it added, or if this fills you with horror,  just add ‘load_geom_mirror=”yes”‘ to /boot/loader.conf. This does bring it in early enough in the process to let you boot from it. The loader will boot from one drive or the other and then switch to mirror mode when it’s done.

So, at this point, you’ve set up FreeBSD as you like on one drive (ada0), selecting BSDlabels or MBR as the partition method and UFS as the file system. You’ve set it to load the geom_mirror module in loader.conf.  You’re now looking at a root prompt on the console, and I’m assuming your drives are ada0 and ada1, and you want to call your mirror gm0.

Try this:

gmirror label gm0 ada0

Did it work? Well it used to once, but now you’ll probably get an error message saying it could not write metadata to ada0. If (when) this happens I know of one answer, which I found after trying everything else. Don’t be tempted to try everything else yourself (such as seeing if it works with ada1). Anything you do will either fail if you’re lucky, or make things worse. So just reboot, and select single-user mode from the loader menu.

Once you’re at the prompt, type the command again, and this time it should say that gm0 is created. My advice is to now reboot rather than getting clever.

When you do reboot it will fail to mount the root partition and stop, asking for help to find it. Don’t panic. We know where it’s gone. Mount it with “ufs:/dev/mirror/gm0s1a” or whatever slice you had it on if you’ve tried to be clever. Forgot to make a note? Don’t worry, somewhere on the boot long visible on the screen it actually tell you the name of the partition it couldn’t find.

After this you should be “in”. And to avoid this inconvenience next time you boot you’ll need to tweak /etc/fstab using an editor of your choice, although real computer nerds only use vi. What you need to do is replace all references to the actual drive with the gm0 version. Therefore /dev/ada0s1a should be edited to read /dev/mirror/gm0s1a. On a current default install, which no longer partitions the drive, this will only apply the root mount point and the swap file.

Save this, reboot (to test) and you should be looking good. Now all that remains is to add the second drive (ada1 in the example) with the line:

gmirror insert gm0 ada1

You can see the effect by running:

gmirror status

Unless your drive is very small, gm0 will be DEGRADED and it will say something about being rebuilt. The precise wording has changed over time. Rebuilding takes hours, not seconds so leave it. Did I mention it’s a good idea to do this when the system isn’t busy?

Leave a Reply

Your email address will not be published. Required fields are marked *