ZFS is not always the answer. Bring back gmirror!

The ZFS bandwaggon has momentum, but ZFS isn’t for everyone. UFS2 has a number of killer advantages in some applications.

ZFS is great if you want to store a very large number of normal files safely. It’s copy-on-write (COW) is a major advantage for backup, archiving and general data safety, and datasets allow you to fine-tune almost any way you can think of. However, in a few circumstances, UFS2 is better. In particular, large random-access files do badly with COW.

Unlike traditional systems, a block in a file isn’t overwritten in place, it always ends up at a different location. If a file started off contiguous it’ll pretty soon be fragmented to hell and performance will go off a cliff. Obvious victims will be databases and VM hard disk images. You can tune for these, but to get acceptable performance you need to throw money and resources to bring ZFS up to the same level. Basically you need huge RAM caches, possibly an SLOG, and never let your pool get more than 50% full. If you’re unlucky enough to end up at 80% full ZFS turns off speed optimisations to devote more RAM to caching as things are going to get very bad fragmentation-wise.

If these costs are a problem, stuck with UFS. And for redundancy, there is still good old GEOM Mirror (gmirror). Unfortunately the documentation of this now-poor relation has lagged a bit, and what once worked as standard, doesn’t. So here are some tips.

The most common use of gmirror (with me anyway) is a twin-drive host. Basically I don’t want things to fail when a hard disk dies, so I add a second redundant drive. Such hosts (often 1U servers) don’t have space for more than two drives anyway – and it pays to keep things simple.

Setting up a gmirror is really simple. You create one using the “gmirror label” command. There is no “gmirror create” command; it really is called “label”, and it writes the necessary metadata label so that mirror will recognise it (“gmirror destroy” is present and does exactly what you might expect).

So something like:

gmirror label gm0 ada1 ada2

will create a device called /dev/mirror/gm0 and it’ll contain ada1’s contents mirrored on to ada2 (once it’s copied it all in the background). Just use /dev/mirror/gm0 as any other GEOM (i.e. disk). Instead of calling it gm0 I could have called it gm1, system, data, flubnutz or anything else that made sense, but gm0 is a handy reminder that it’s the first geom mirror on the system and it’s shorter to type.

The eagle eyed might have noticed I used ada1 and ada2 above. You’ve booted off ada0, right? So what happens if you try mirroring yourself with “gmirror label gm0 ada0 ada1“? Well this used to work, but in my experience it doesn’t any more. And on a twin-drive system, this is exactly what you want to do. But it is still possible, read on…

How to set up a twin-drive host booting from a geom mirror

First off, before you do anything (even installing FreeBSD) you need to set up your disks. Since the IBM XT, hard disks have been partitioned using an MBR (Master Boot Record) at the start. This is really old, naff, clunky and Microsoft. Those in the know have been using the far superior GPT system for ages, and it’s pretty cross-platform now. However, it doesn’t play nice with gmirror, so we’re going to use MBR instead. Trust me on this.

For the curious, know that GPT keeps a copy of the partition table at the beginning and end of the disk, but MBR only has one, stored at the front. gmirror keeps its metadata at the end of the disk, well away from the MBR but unfortunately in exactly the same spot as the spare GPT. You can hack the gmirror code so it doesn’t do this, or frig around with mirroring geoms rather than whole disks and somehow get it to boot, but my advice is to stick to MBR partitioning or BSDlabels, which is an extension. There’s not a lot of point in ever mounting your BSD boot drive on a non-BSD system, so you’re not losing much whatever you choose.

Speaking of metadata, both GPT and gmirror can get confused if they find any old tables or labels on a “new” disk. GPT will find old backup partition tables and try to restore them for you, and gmirror will recognise old drives as containing precious data and dig its heels in when you try to overwrite it. Both gpart and gmirror have commands to erase their metadata, but I prefer to use dd to overwrite the whole disk with zeros anyway before re-use. This checks that the disk is actually good, which is nice to know up-front. You could just erase the start and end if you were in a hurry and wanted to calculate the offsets.

Please generate and paste your ad code here. If left empty, the ad location will be highlighted on your blog pages with a reminder to enter your code. Mid-Post

The next thing you’ll need to do is load the geom_mirror kernel module. Either recompile the kernel with it added, or if this fills you with horror,  just add ‘load_geom_mirror=”yes”‘ to /boot/loader.conf. This does bring it in early enough in the process to let you boot from it. The loader will boot from one drive or the other and then switch to mirror mode when it’s done.

So, at this point, you’ve set up FreeBSD as you like on one drive (ada0), selecting BSDlabels or MBR as the partition method and UFS as the file system. You’ve set it to load the geom_mirror module in loader.conf.  You’re now looking at a root prompt on the console, and I’m assuming your drives are ada0 and ada1, and you want to call your mirror gm0.

Try this:

gmirror label gm0 ada0

Did it work? Well it used to once, but now you’ll probably get an error message saying it could not write metadata to ada0. If (when) this happens I know of one answer, which I found after trying everything else. Don’t be tempted to try everything else yourself (such as seeing if it works with ada1). Anything you do will either fail if you’re lucky, or make things worse. So just reboot, and select single-user mode from the loader menu.

Once you’re at the prompt, type the command again, and this time it should say that gm0 is created. My advice is to now reboot rather than getting clever.

When you do reboot it will fail to mount the root partition and stop, asking for help to find it. Don’t panic. We know where it’s gone. Mount it with “ufs:/dev/mirror/gm0s1a” or whatever slice you had it on if you’ve tried to be clever. Forgot to make a note? Don’t worry, somewhere on the boot long visible on the screen it actually tell you the name of the partition it couldn’t find.

After this you should be “in”. And to avoid this inconvenience next time you boot you’ll need to tweak /etc/fstab using an editor of your choice, although real computer nerds only use vi. What you need to do is replace all references to the actual drive with the gm0 version. Therefore /dev/ada0s1a should be edited to read /dev/mirror/gm0s1a. On a current default install, which no longer partitions the drive, this will only apply the root mount point and the swap file.

Save this, reboot (to test) and you should be looking good. Now all that remains is to add the second drive (ada1 in the example) with the line:

gmirror insert gm0 ada1

You can see the effect by running:

gmirror status

Unless your drive is very small, gm0 will be DEGRADED and it will say something about being rebuilt. The precise wording has changed over time. Rebuilding takes hours, not seconds so leave it. Did I mention it’s a good idea to do this when the system isn’t busy?

How to stop Samba users deleting their home directory and email

Samba Carnival Helsinki summer 2009
Samba Carnival (the real Samba logo is sooo boring)

UNIX permissions can send you around the twist sometimes. You can set them up to do anything, not. Here’s a good case in point…

Imagine you have Samba set up to provide users with a home directory. This is a useful feature; if you log in to the server with the name “fred” you (and only you) will see a network share called “fred”, which contains the files in your UNIX/Linux home directory. This is great for knowledgeable computer types, but is it such a great idea for normal lusers? If you’re running IMAP email it’s going to expose your mail directory, .forward and a load of other files that Windoze users might delete on a whim, and really screw things up.

Is there a Samba option to share home directories but to leave certain subdirectories alone? No. Can you just change the ownership and permissions of the critical files to  root and deny write access? No! (Because mail systems require such files to be owned by their user for security reasons). Can you use permission bits or even an ACL? Possibly, but you’ll go insane trying.

A bit of lateral thinking is called for here. Let’s start with the standard section in smb.conf for creating automatic shares for home directories:

    comment = Home Directories
    browseable = no
    writable = yes

The “homes” section is special – the name “homes” is reserved to make it so. Basically it auto-creates a share with a name matching the user when someone logs in, so that they can get to their home directory.

First off, you could make it non-writable (i.e. set writable = no). Not much use to use luser, but it does the job of stopping them deleting anything. If read-only access is good enough, it’s an option.

The next idea, if you want it to be useful, is to use the directive “hide dot files” in the definition. This basically returns files beginning in a ‘.’ as “hidden” to Windoze users, hiding the UNIX user configuration files and other stuff you don’t want deleted. Unfortunately the “mail” directory, containing all your loverly IMAP folders is still available for wonton destruction, but you can hide this too by renaming it .mail. All you then need to do is tell your mail server to use the new name. For example, in dovecot.conf, uncomment and edit the line thus:

mail_location = mbox:~/.mail/:INBOX=/var/mail/%u

(Note the ‘.’ added at the front of ~/mail/)

You then have to rename each of the user’s “mail” folders to “.mail”, restart dovecot and the job is done.

Except when you have lusers who have turned on the “Show Hidden Files” option in Windoze, of course. A surprising number seem to think this is a good idea. You could decide that hidden files allows advanced users control of their mail and configuration, and anyone messing with a hidden file can presumably be trusted to know what you’re doing. You could even mess with Windoze policies to stop them doing this (ha!). Or you may take the view that all lusers and dangerous and if there is a way to mess things up, they’ll find it and do it. In this case, here’s Plan B.

The trick is to know that the default path to shares in [homes] is ‘~’, but you can actually override this! For example:

    path = /usr/data/flubnutz

This  maps users’ home directories in a single directory called ‘flubnutz’. This is not that useful, and I haven’t even bothered to try it myself. When it becomes interesting is when you can add a macro to the path name. %S is a good one to use because it’s the name as the user who has logged in (the service name). %u, likewise. You can then do stuff like:

     path = /usr/samba-files/%S

This stores the user’s home directory files in a completely different location, in a directory matching their name. If you prefer to keep the user’s account files together (like a sensible UNIX admin) you can use:

     comment = Home Directories
     path = /usr/home/%S/samba-files
     browseable = no
     writable = yes<

As you can imagine, this stores their Windows home directory files in a sub-directory to their home directory; one which they can’t escape from. You have to create “~/samba-files” and give them ownership of it for this to work. If you don’t want to use the explicit path, %h/samba-files should do instead.

I’ve written a few scripts to create directories and set permissions, which I might add to this if anyone expresses an interest.