Networking FreeBSD Jails

Or port forwarding to a jail

I’ve already explained how easy FreeBSD jails are to set up and use without resorting to installing heavy management tools, but today I thought I’d add a bit about networking. Specifically, how do you pass traffic arriving on a particular port to a service running inside a jail?

It’s actually very easy. All you need is a very local network inside FreeBSD, natted to the one outside.

Suppose you have your jail.conf set up as per my previous article. Here’s an excerpt:

tom { ip4.addr = 192.168.0.2 ; }
dick { ip4.addr = 192.168.0.3 ; }
harry { ip4.addr = 192.168.0.4 ; }

The defaults were set earlier in the file; the only thing that’s unique about each jail is the IPv4 address and the name. What I didn’t say at the time was that 192.168.0.0 could have been on an internal network.

To create your local network, just define it in rc.conf:

cloned_interfaces="lo1"
ipv4_addrs_lo1="192.168.0.1-14/28"

This creates another local loopback interface and assigns a range of IPv4 addresses to it. This can be as large as you wish, but I’ve defined 1..14 (with appropriate subnet mask) because they’ll be listed every time you run ifconfig!
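
If you want a different size of range, the arithmetic is easy to check. Here is a minimal sketch in plain sh (the function name usable_hosts is made up for illustration): a /28 leaves 4 host bits, so 16 addresses, and leaving out the first and last by convention gives the 1..14 used above.

```shell
#!/bin/sh
# usable_hosts PREFIX: host addresses in an IPv4 network with the given
# prefix length, leaving out the network and broadcast addresses.
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

usable_hosts 28   # the /28 from rc.conf above; prints 14
```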

Next you’re going to need something to do the natting. pf is your friend. Enable it in rc.conf too:

pf_enable="yes"

And you’ll need an /etc/pf.conf file to do the magic. I like pf – it’s easier for my brain to understand than most. Here’s an example file:

PUB_IP="192.168.1.217"
INT="bge0"
JAIL_NET="192.168.0.0/24"
TOM="192.168.0.2"
DICK="192.168.0.3"
HARRY="192.168.0.4"
scrub in all
nat pass on $INT from $JAIL_NET to any -> $PUB_IP
block on $INT proto tcp from any to $PUB_IP port 111
rdr pass on $INT proto tcp from any to $PUB_IP port 3306 -> $TOM
rdr pass on $INT proto tcp from any to $PUB_IP port {21,80,443} -> $DICK
rdr pass on $INT proto tcp from any to $PUB_IP port 81 -> $HARRY port 80

So what’s going on?

I’ve used a few macros. PUB_IP is your public IP address, and INT is the interface it’s on. pf may figure some of this out, but I’m being explicit.

TOM, DICK and HARRY are the IPv4 addresses of the jails.

Next I’m scrubbing all interfaces (normally a good idea, but you don’t have to). But the next line is important – it uses nat to allow stuff on your jail network to talk to the outside world.

The following line is where you might want to block more stuff – in this case the RPC portmapper (rpcbind, which NFS relies on) on port 111. Then we’re back to jail things for the final three lines. They’re pretty self-explanatory, but here’s an explanation anyway.

Let’s say the tom jail is running a MariaDB server on port 3306. The first line takes anything arriving on port 3306 and sends it to tom’s jail IP. Simple. It can reply because of the nat line earlier.

dick is running a web and ftp server, so ports 21, 80 and 443 are sent there. The pf syntax lets you do nice stuff like this with the {..}

Finally we come to harry. Here we’re running an http server on port 80, but to make it accessible externally we’re mapping it to port 81 as otherwise it would clash with dick. In other words, if you don’t specify a port after the redirect target, pf keeps the port the traffic originally arrived on.

And that’s it! When your jail is started you will see an interface lo1 with the IP address defined in /etc/jail.conf and, assuming you have something sensible in /etc/resolv.conf, you’ll have a jail that looks like it’s running behind a NAT router with port forwarding.
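
Before relying on it, it’s worth checking that pf actually accepts the file. Something like the following should do. The pfctl flags -n (parse only), -f (load a file) and -s nat (show translation rules) are standard; the run wrapper is only a hypothetical dry-run helper, so the commands are printed until you decide to execute them as root.

```shell
#!/bin/sh
# Dry run by default: each command is printed rather than executed.
# Set RUN=1 (as root, on the FreeBSD host) to execute for real.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

run pfctl -nf /etc/pf.conf   # parse the ruleset only; reports syntax errors
run pfctl -f /etc/pf.conf    # load the ruleset
run pfctl -s nat             # list the nat and rdr rules now in force
```

A mistyped macro name is the sort of thing the parse-only pass should catch before it bites.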

Of course, if you don’t need to map a jailed service to an external IP address, don’t! Jails can access services on each other using their own virtual network.

FreeBSD in Godden Green

What is going on with FreeBSD in Godden Green in Kent, UK? Jobsite has been spamming me with junior/mid-level programmer roles mentioning FreeBSD for months now, and I’m getting curious!

I have an alert set up so whenever FreeBSD is mentioned I get a ping, as I like to know what’s going on. This isn’t one of the usual suspects AFAIK – they might even be interesting!

ZFS is not always the answer. Bring back gmirror!

The ZFS bandwagon has momentum, but ZFS isn’t for everyone. UFS2 has a number of killer advantages in some applications.

ZFS is great if you want to store a very large number of normal files safely. Its copy-on-write (COW) design is a major advantage for backup, archiving and general data safety, and datasets allow you to fine-tune almost any way you can think of. However, in a few circumstances, UFS2 is better. In particular, large random-access files do badly with COW.

Unlike traditional systems, a block in a file isn’t overwritten in place, it always ends up at a different location. If a file started off contiguous it’ll pretty soon be fragmented to hell and performance will go off a cliff. Obvious victims will be databases and VM hard disk images. You can tune for these, but to get acceptable performance you need to throw money and resources to bring ZFS up to the same level. Basically you need huge RAM caches, possibly an SLOG, and never let your pool get more than 50% full. If you’re unlucky enough to end up at 80% full ZFS turns off speed optimisations to devote more RAM to caching as things are going to get very bad fragmentation-wise.
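
If you do run ZFS, it helps to know how close you are to those marks. Below is a rough sketch that reads the output of zpool list -H -o name,capacity and flags pools against the 50% and 80% figures above. The function name, the wording of the messages and the sample pool names are all made up for illustration.

```shell
#!/bin/sh
# check_capacity: reads "pool<TAB>NN%" lines on stdin, as printed by
#   zpool list -H -o name,capacity
# and flags pools past the 50% and 80% marks mentioned above.
check_capacity() {
    awk '{
        pct = $2
        sub(/%/, "", pct)
        pct += 0
        if (pct >= 80)      print $1 ": " pct "% full - trouble"
        else if (pct > 50)  print $1 ": " pct "% full - getting tight"
        else                print $1 ": " pct "% full - ok"
    }'
}

# Sample input with made-up pool names; on a real host use:
#   zpool list -H -o name,capacity | check_capacity
printf 'tank\t42%%\nvmstore\t83%%\n' | check_capacity
```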

If these costs are a problem, stick with UFS. And for redundancy, there is still good old GEOM Mirror (gmirror). Unfortunately the documentation of this now-poor relation has lagged a bit, and what once worked as standard, doesn’t. So here are some tips.

The most common use of gmirror (with me anyway) is a twin-drive host. Basically I don’t want things to fail when a hard disk dies, so I add a second redundant drive. Such hosts (often 1U servers) don’t have space for more than two drives anyway – and it pays to keep things simple.

Setting up a gmirror is really simple. You create one using the “gmirror label” command. There is no “gmirror create” command; it really is called “label”, and it writes the necessary metadata label so that gmirror will recognise it (“gmirror destroy” is present and does exactly what you might expect).

So something like:

gmirror label gm0 ada1 ada2

will create a device called /dev/mirror/gm0 and it’ll contain ada1’s contents mirrored on to ada2 (once it’s copied it all in the background). Just use /dev/mirror/gm0 as any other GEOM (i.e. disk). Instead of calling it gm0 I could have called it gm1, system, data, flubnutz or anything else that made sense, but gm0 is a handy reminder that it’s the first geom mirror on the system and it’s shorter to type.

The eagle-eyed might have noticed I used ada1 and ada2 above. You’ve booted off ada0, right? So what happens if you try mirroring yourself with “gmirror label gm0 ada0 ada1”? Well this used to work, but in my experience it doesn’t any more. And on a twin-drive system, this is exactly what you want to do. But it is still possible, read on…

How to set up a twin-drive host booting from a geom mirror

First off, before you do anything (even installing FreeBSD) you need to set up your disks. Since the IBM XT, hard disks have been partitioned using an MBR (Master Boot Record) at the start. This is really old, naff, clunky and Microsoft. Those in the know have been using the far superior GPT system for ages, and it’s pretty cross-platform now. However, it doesn’t play nice with gmirror, so we’re going to use MBR instead. Trust me on this.

For the curious, know that GPT keeps a copy of the partition table at the beginning and end of the disk, but MBR only has one, stored at the front. gmirror keeps its metadata at the end of the disk, well away from the MBR but unfortunately in exactly the same spot as the spare GPT. You can hack the gmirror code so it doesn’t do this, or frig around with mirroring geoms rather than whole disks and somehow get it to boot, but my advice is to stick to MBR partitioning or BSDlabels, which is an extension. There’s not a lot of point in ever mounting your BSD boot drive on a non-BSD system, so you’re not losing much whatever you choose.

Speaking of metadata, both GPT and gmirror can get confused if they find any old tables or labels on a “new” disk. GPT will find old backup partition tables and try to restore them for you, and gmirror will recognise old drives as containing precious data and dig its heels in when you try to overwrite it. Both gpart and gmirror have commands to erase their metadata, but I prefer to use dd to overwrite the whole disk with zeros anyway before re-use. This checks that the disk is actually good, which is nice to know up-front. You could just erase the start and end if you were in a hurry and wanted to calculate the offsets.
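
For the start-and-end approach the offsets are simple arithmetic. This sketch only prints the dd commands rather than running them, since they are destructive. It assumes 512-byte sectors and wipes 1 MiB at each end, which is comfortably more than the partition tables and the gmirror label occupy. The function name wipe_cmds is hypothetical; the sector count should come from “diskinfo -v” on the real drive. The built-in erasers mentioned above are “gmirror clear” and “gpart destroy -F”.

```shell
#!/bin/sh
# wipe_cmds DISK SECTORS: print (not run!) dd commands that zero the first
# and last 1 MiB of DISK, where SECTORS is its size in 512-byte sectors.
# Assumes 512-byte sectors; get the real figures from "diskinfo -v DISK".
wipe_cmds() {
    disk=$1
    total=$2
    n=2048                       # 2048 x 512 bytes = 1 MiB
    start=$(( total - n ))       # first sector of the final 1 MiB
    echo "dd if=/dev/zero of=/dev/$disk bs=512 count=$n"
    echo "dd if=/dev/zero of=/dev/$disk bs=512 seek=$start count=$n"
}

wipe_cmds ada1 1953525168        # sector count of a typical 1 TB drive
```

Check the printed device name twice before pasting either line into a root shell.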

The next thing you’ll need to do is load the geom_mirror kernel module. Either recompile the kernel with it added or, if this fills you with horror, just add ‘geom_mirror_load="YES"’ to /boot/loader.conf. This does bring it in early enough in the process to let you boot from it. The loader will boot from one drive or the other and then switch to mirror mode when it’s done.

So, at this point, you’ve set up FreeBSD as you like on one drive (ada0), selecting BSDlabels or MBR as the partition method and UFS as the file system. You’ve set it to load the geom_mirror module in loader.conf. You’re now looking at a root prompt on the console, and I’m assuming your drives are ada0 and ada1, and you want to call your mirror gm0.

Try this:

gmirror label gm0 ada0

Did it work? Well it used to once, but now you’ll probably get an error message saying it could not write metadata to ada0. If (when) this happens I know of one answer, which I found after trying everything else. Don’t be tempted to try everything else yourself (such as seeing if it works with ada1). Anything you do will either fail if you’re lucky, or make things worse. So just reboot, and select single-user mode from the loader menu.

Once you’re at the prompt, type the command again, and this time it should say that gm0 is created. My advice is to now reboot rather than getting clever.

When you do reboot it will fail to mount the root partition and stop, asking for help to find it. Don’t panic. We know where it’s gone. Mount it with “ufs:/dev/mirror/gm0s1a” or whatever slice you had it on if you’ve tried to be clever. Forgot to make a note? Don’t worry, somewhere in the boot log visible on the screen it actually tells you the name of the partition it couldn’t find.

After this you should be “in”. And to avoid this inconvenience next time you boot you’ll need to tweak /etc/fstab using an editor of your choice, although real computer nerds only use vi. What you need to do is replace all references to the actual drive with the gm0 version. Therefore /dev/ada0s1a should be edited to read /dev/mirror/gm0s1a. On a current default install, which no longer partitions the drive, this will only apply to the root mount point and the swap partition.
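
If you would rather not do it by hand, the same edit can be sketched as a sed filter. The function name to_mirror and the sample fstab lines are illustrative only; it writes to standard output, so nothing is changed until you choose to replace the real file.

```shell
#!/bin/sh
# to_mirror DISK MIRROR: filter an fstab on stdin, rewriting device names
# that start with /dev/DISK to /dev/mirror/MIRROR. Other lines pass through.
to_mirror() {
    sed "s|^/dev/$1|/dev/mirror/$2|"
}

# Illustrative fstab lines, not from a real system
printf '/dev/ada0s1a\t/\tufs\trw\t1\t1\n/dev/ada0s1b\tnone\tswap\tsw\t0\t0\n' |
    to_mirror ada0 gm0
```

On the real host that would be something like “to_mirror ada0 gm0 < /etc/fstab > /etc/fstab.new”, followed by a careful read of the result before moving it into place.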

Save this, reboot (to test) and you should be looking good. Now all that remains is to add the second drive (ada1 in the example) with the line:

gmirror insert gm0 ada1

You can see the effect by running:

gmirror status

Unless your drive is very small, gm0 will be DEGRADED and it will say something about being rebuilt. The precise wording has changed over time. Rebuilding takes hours, not seconds so leave it. Did I mention it’s a good idea to do this when the system isn’t busy?
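
If you want a script to tell you when the rebuild has finished, one possible approach is to look for any mirror whose status is not COMPLETE. This is a sketch only: the function name is made up, and the sample text is illustrative because the layout of the status output varies between releases.

```shell
#!/bin/sh
# mirrors_not_complete: reads "gmirror status" output on stdin and prints
# the name of every mirror whose Status column is not COMPLETE.
mirrors_not_complete() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 }'
}

# Illustrative output only; the exact layout varies between releases.
sample='      Name    Status  Components
mirror/gm0  DEGRADED  ada0 (ACTIVE)
                      ada1 (SYNCHRONIZING, 12%)'

echo "$sample" | mirrors_not_complete
```

On a real host you would pipe “gmirror status” into the function and carry on once it prints nothing.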

How to stop Samba users deleting their home directory and email

Samba Carnival (the real Samba logo is sooo boring)

UNIX permissions can send you around the twist sometimes. You can set them up to do anything, not. Here’s a good case in point…

Imagine you have Samba set up to provide users with a home directory. This is a useful feature; if you log in to the server with the name “fred” you (and only you) will see a network share called “fred”, which contains the files in your UNIX/Linux home directory. This is great for knowledgeable computer types, but is it such a great idea for normal lusers? If you’re running IMAP email it’s going to expose your mail directory, .forward and a load of other files that Windoze users might delete on a whim, and really screw things up.

Is there a Samba option to share home directories but to leave certain subdirectories alone? No. Can you just change the ownership and permissions of the critical files to root and deny write access? No! (Because mail systems require such files to be owned by their user for security reasons). Can you use permission bits or even an ACL? Possibly, but you’ll go insane trying.

A bit of lateral thinking is called for here. Let’s start with the standard section in smb.conf for creating automatic shares for home directories:

[homes]
    comment = Home Directories
    browseable = no
    writable = yes

The “homes” section is special – the name “homes” is reserved to make it so. Basically it auto-creates a share with a name matching the user when someone logs in, so that they can get to their home directory.

First off, you could make it non-writable (i.e. set writable = no). Not much use to your luser, but it does the job of stopping them deleting anything. If read-only access is good enough, it’s an option.

The next idea, if you want it to be useful, is to use the directive “hide dot files” in the definition. This basically returns files beginning with a ‘.’ as “hidden” to Windoze users, hiding the UNIX user configuration files and other stuff you don’t want deleted. Unfortunately the “mail” directory, containing all your loverly IMAP folders, is still available for wanton destruction, but you can hide this too by renaming it .mail. All you then need to do is tell your mail server to use the new name. For example, in dovecot.conf, uncomment and edit the line thus:

mail_location = mbox:~/.mail/:INBOX=/var/mail/%u

(Note the ‘.’ added at the front of “mail”.)

You then have to rename each of the user’s “mail” folders to “.mail”, restart dovecot and the job is done.
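
The renaming is easy to script. Here is a minimal sketch, assuming one directory per user under a common base such as /usr/home. The function name hide_mail_dirs is hypothetical, and the demonstration runs against a scratch directory rather than real accounts.

```shell
#!/bin/sh
# hide_mail_dirs BASE: for each directory under BASE (one per user), rename
# "mail" to ".mail" unless a ".mail" is already there.
hide_mail_dirs() {
    for home in "$1"/*/; do
        [ -d "${home}mail" ] || continue
        [ -e "${home}.mail" ] && continue
        mv "${home}mail" "${home}.mail"
    done
}

# Try it on a scratch copy first
scratch=$(mktemp -d)
mkdir -p "$scratch/fred/mail" "$scratch/jim"
hide_mail_dirs "$scratch"
ls -A "$scratch/fred"
```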

Except when you have lusers who have turned on the “Show Hidden Files” option in Windoze, of course. A surprising number seem to think this is a good idea. You could decide that hidden files allow advanced users control of their mail and configuration, and anyone messing with a hidden file can presumably be trusted to know what they’re doing. You could even mess with Windoze policies to stop them doing this (ha!). Or you may take the view that all lusers are dangerous and if there is a way to mess things up, they’ll find it and do it. In this case, here’s Plan B.

The trick is to know that the default path to shares in [homes] is ‘~’, but you can actually override this! For example:

[homes]
    path = /usr/data/flubnutz
    ...

This maps all users’ home directories to a single directory called ‘flubnutz’. This is not that useful, and I haven’t even bothered to try it myself. Where it becomes interesting is when you add a macro to the path name. %S is a good one to use because, in a [homes] share, it’s the same as the name of the user who has logged in (the service name). %u, likewise. You can then do stuff like:

[homes]
     path = /usr/samba-files/%S
     ....

This stores the user’s home directory files in a completely different location, in a directory matching their name. If you prefer to keep the user’s account files together (like a sensible UNIX admin) you can use:

[homes]
     comment = Home Directories
     path = /usr/home/%S/samba-files
     browseable = no
     writable = yes

As you can imagine, this stores their Windows home directory files in a sub-directory to their home directory; one which they can’t escape from. You have to create “~/samba-files” and give them ownership of it for this to work. If you don’t want to use the explicit path, %h/samba-files should do instead.
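
Creating those directories for existing users could be sketched like this. It is an illustration under assumptions, not a finished tool: the function name make_samba_dirs is made up, the owner is taken from whoever owns each home directory, and the 700 mode is a guess at a sensible default.

```shell
#!/bin/sh
# make_samba_dirs BASE: create BASE/<user>/samba-files for every home
# directory under BASE, owned by whoever owns the home directory itself.
make_samba_dirs() {
    for home in "$1"/*/; do
        [ -d "$home" ] || continue
        owner=$(ls -ld "$home" | awk '{ print $3 }')
        mkdir -p "${home}samba-files"
        chown "$owner" "${home}samba-files"
        chmod 700 "${home}samba-files"   # assumed mode; adjust to taste
    done
}

# Demonstration against a scratch directory rather than /usr/home
demo=$(mktemp -d)
mkdir -p "$demo/fred" "$demo/jim"
make_samba_dirs "$demo"
ls "$demo/fred"
```

On the server you would run it as root against the real base directory, and “testparm -s” is a handy way to check that smb.conf still parses after editing.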

I’ve written a few scripts to create directories and set permissions, which I might add to this if anyone expresses an interest.