FreeBSD, ZFS and Denial of Service

I’ve been using ZFS since FreeBSD 8, and it has its uses. It’s pretty wonderful and all that, but I was actually pretty happy with UFS, and switching to ZFS isn’t a no-brainer.

So what’s the up-side to ZFS? Well, you get more error checking and correction, and it’s great for managing huge filing systems. You can snapshot and roll back, and do lots of other wonderful stuff with datasets and drive arrays. And it’s more “auto” when it comes to allocating disk space. But call me old fashioned if you like; I don’t like “auto” if I can avoid it.

Penguinistas might not “get” this next bit, but on a UNIX system you didn’t normally have One Big Disk. Instead you had several, and even if you only had one, you’d partition or slice it up so it looked like several. And then, of course, you’d mount disks or partitions on to the root filing system wherever you wanted them to appear.

For reliability, you could also create mirrors and striped RAIDs, put a FS on them and mount them wherever you wanted. And demount them, and mount them somewhere else, and so on.

ZFS does all this good stuff, but automatically, and often as One Big Disk. A good thing? Well… if you must. But there are a few points you might want to consider before diving in.

First off, I like to know where and on which disk my data actually resides. I’m really uneasy with ZFS deciding for me. If ZFS loses it, I want to know where to find it. I also like having a FS on each drive or partition, so I can pull the drive out and mount it wherever I want to get data off – or move it from machine to machine. It’s my data, I’ll do what I want to with it, dammit! You can do this virtually with ZFS datasets, but you can’t unplug a dataset and hold it in your hand. Datasets, of course, are fluid rather than fixed in size, so you don’t need to guess how much space to allocate.

Secondly, with UFS I get to decide what hardware is used for each kind of file. Parts of the FS that are rarely used can be put on slow, cheap, huge disks. The database goes on a VelociRaptor or better, and the swap partitions – well! Okay, you can use multiple zpools for different performance situations, but then you’re using it like UFS.

Thirdly, there’s a price for all this ZFS wonderfulness. Apart from the software overhead, the Copy-on-Write business needs a lot of RAM to maintain good performance. Fragmentation on the physical drive is guaranteed. If you’re running software (e.g. a database) that uses random access files and lots of transactions, UFS with its in-place modification wins out. A DBMS will take care of its own consistency and storage optimisation, and it has the edge as it knows what the data represents at the application level.

But what of the Denial of Service problem in the headline? Okay, it’s been a bit of a ramble, but this is something you must consider.

There are always management issues with One Big Disk. Linux users seem oblivious to this, but this doesn’t mean putting everything on a big partition is a great plan – even if you’re using a single disk in practice.

With the old way of having multiple partitions, each with an FS, mounted on the directory tree, when an FS on a partition/drive filled up, it was full. You couldn’t create more files on it. You either had to delete unwanted stuff, or mount a bigger drive in its place. With One Big Disk, when it’s full, it’s also full. The difference is that you can’t write any data anywhere on the entire FS. And this is where DoS comes in.

Take, for example, /var/log. Any UNIX admin with a bit of sense will have this in its own partition. If some script kiddie then did something that caused a lot of log file activity, eventually you’d run out of space in /var/log. But the rest of the system would still be alive. With UFS the default installation process created partitions with sensible sizes. Using the One Big Disk principle, ZFS satisfies the requests of any disk-eating process until there isn’t a single byte left anywhere, and then rolls over saying the zpool is full. Or it would say it if there was a monitor connected to the server in a data centre miles away, and there was someone there to look at it.
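Since nobody is watching that console, something has to do the watching for you. Here is a minimal sketch of a fullness check you could run from cron. It assumes Python 3 is installed; the mount points and the 80% threshold are illustrative guesses rather than recommendations, and it only prints warnings, so delivering them (mail, monitoring system) is left out.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 if total is 0)."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


def check_mounts(mount_points, threshold=80.0, usage_fn=shutil.disk_usage):
    """Return (mount_point, percent) pairs at or above the threshold.

    usage_fn defaults to shutil.disk_usage; it is a parameter only so the
    check can be exercised without touching real disks.
    """
    alerts = []
    for mount_point in mount_points:
        try:
            usage = usage_fn(mount_point)
        except OSError:
            continue  # path missing or unreadable on this machine: skip it
        pct = percent_used(usage.total, usage.used)
        if pct >= threshold:
            alerts.append((mount_point, round(pct, 1)))
    return alerts


if __name__ == "__main__":
    # Hypothetical list of mount points; adjust to your own layout.
    for mount_point, pct in check_mounts(["/", "/var", "/var/log"]):
        print(f"WARNING: {mount_point} is {pct}% full")
```

This only tells you that space is running out. It does nothing to stop one directory eating the whole pool.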

With ZFS you can set a quota on a dataset-by-dataset basis and prevent this sort of thing from happening. But it doesn’t happen by default, so set your quotas manually if you’re plonking the OS, and in particular /var, on it.
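As a rough sketch of what setting quotas by hand might look like, the script below prints one `zfs set quota=...` command per dataset. The dataset names and sizes are hypothetical; substitute whatever `zfs list` shows on your own system. It deliberately prints the commands instead of running them, so you can read them before pasting them into a root shell.

```python
import shlex

# Hypothetical datasets and limits: replace with names from `zfs list`.
QUOTAS = {
    "zroot/var/log": "2G",
    "zroot/var/tmp": "1G",
    "zroot/tmp": "4G",
}


def quota_commands(quotas):
    """Build one `zfs set quota=<limit> <dataset>` argument list per dataset."""
    return [
        ["zfs", "set", f"quota={limit}", dataset]
        for dataset, limit in sorted(quotas.items())
    ]


if __name__ == "__main__":
    # Dry run: print the commands for review, do not execute them.
    for argv in quota_commands(QUOTAS):
        print(shlex.join(argv))
```

Note that `quota` counts the dataset together with its snapshots and descendants. If you only want to cap the live data, `refquota` is the property to look at. Afterwards, `zfs get quota` on the dataset shows what is actually set.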

Okay, this might sound a bit anti-ZFS, and I’ve yet to have a disaster with a ZFS system that’s required me to move drives around, so I don’t really know how possible it is when the chips are down. And ZFS is a nice unified way of doing stuff, rather than fiddling around with geom and the FS separately. But after a couple of years with FreeBSD 10, where it became practical to boot from ZFS, shouldn’t I be feeling a bit more enthusiastic about it?

Having a ZFS pool attached as a data store rather than as a boot device is, of course, a different story. That’s when you see the benefits. But it does also eat resources, so I want the benefits to be worth it for the particular application. For the time being I’m putting the OS on UFS, usually with a data partition for databases to thrash, and userland putting simple files on ZFS – best of both worlds.

One Reply to “FreeBSD, ZFS and Denial of Service”

  1. I always use disk partitions for ZFS vdevs. For some data you may need x-way mirrors, the other – raidz2 or raidz.
    Hard drive speed varies at the start and end area as well.
    Even more – an “operator mistake” can ruin things – with just a single pool you get a total disaster, multiple pools = “things aren’t that bad”.
    ZFS for root fs? not that good idea either, there were multiple cases when “something changed” and ZFS components went out of sync with “no way” to mount the pools. I prefer multiple independent hard drives; should anything happen – it’s easy to swap the 2 drives and get a bootable environment again.
    newsyslog with size limits is your friend for “logs overflowing FS” (2 entries for each log – time and size in newsyslog.conf). ZFS is good in compressing logs on the fly as well – lz4 gives minimal overhead.
