LetsEncrypt, acme.sh and Apachectl reloads

This morning I woke up to an expired TLS certificate on this blog. This is odd, as it’s automatically renewed from LetsEncrypt using acme.sh, kicked off by a cron job. So what went wrong?

I don’t write about LetsEncrypt or ACME much as I don’t understand everything about it, and it keeps surprising me. But I had discovered a problem with FreeBSD running the latest Apache 2.4 in a jail. As I run my web servers in jails, this applies to me.

I like acme.sh. It’s a shell script. Very clever. No dependencies. Dependencies are against my religion. Why would anyone use a more complex system when there’s something simple that works?

For convenience reasons the certificates are renewed outside of a jail, and the sites are created using a script that sets it all up for me. One source of certificates for multiple jails; it’s easier to manage. It manages sites on other hosts using a simple NFS mount.

When you use acme.sh to renew a certificate for Apache you need to be able to plonk something on the web site. This is easy enough – the certificate host (above the jails) can either get direct access through the filing system, or via NFS. It then gets the new certificate and copies it into the right place. When you first issue yourself a certificate you specify the path you want the certificate to go, and the path to the web site. You also specify the command needed to get your web server to reload. It magically remembers this stuff so the cron job just goes along and does them all. But that’s where the fun starts.

I rehosted the blog on a new instance of Apache, and created a new temporary website to make sure SSL worked – getting acme.sh to issue it a certificate in the process. All good, except I noticed that inside a jail, the new version of Apache stops but doesn’t restart after an “apachectl graceful”. The same with “apachectl reload”. Not great, but I tried using “service -j whatever apache24 restart”. A bit drastic but it worked, and I’ve yet to figure out why other methods like “jexec whatever apachectl graceful” stall.

So what happened this morning at 6am? There were some certificates to renew and acme.sh --cron accidentally KOed Apache. It’s the first time any had expired.

Running acme.sh manually, and restarting Apache manually in between, worked, but it’s hardly the dream of automation promised by Unix. Debugging the script I found it was issuing a graceful restart command, and I thought I’d specified something more emphatic. So I started grepping for the line it was using, assuming it must be in a config file somewhere. Nothing.

Long story short, I eventually found where it had hidden the command: in .acme.sh/domain.name/domain.name.conf, in spite of having looked there already. It turns out that it’s the line “Le_ReloadCmd=”, and it’s unique for each domain (sensible idea), but it’s base64 encoded instead of being plain text! And it’s wrapped between “_ACME_BASE64__START_” and “_ACME_BASE64__END_”. I assume this is done to avoid difficulties with certain characters in shell scripts but it’s a bit of a pain to edit it. You can create a new command by piping it through base64 and editing very carefully, but readable it ain’t.

There is another way – just recopy the certificate. Unfortunately you need to know, and use, the same options as when you originally created it – you can’t just issue a different --reloadcmd. You can check these by looking at the domain.name.conf file, where fortunately these are stored in plain text. Assuming they’re all the same, this little script will do them all for you at once. Adjust as required.

#!/bin/sh

# Make sure you're in the right directory
cd ~/.acme.sh

# Jail containing web site, assumed all the same.
WJAIL=web

# Optional extra acme.sh options; leave empty if you don't need any.
TEST=

for DOM in $(find . -type d -depth 1 | sed "s|^\./||")
do

echo acme.sh $TEST -d $DOM  --install-cert \
        --cert-file /jail/$WJAIL/data/certs/$DOM/cert.pem \
        --key-file /jail/$WJAIL/data/certs/$DOM/cert.key \
        --fullchain-file /jail/$WJAIL/data/certs/$DOM/Fullchain.pem \
        --reloadcmd "service -j $WJAIL apache24 restart"
done

You will notice that this only echoes the command needed, so if anyone’s crazy enough to copy/paste it then it won’t do any damage. Remove the “echo” when you’re satisfied it’s doing the right thing for you.

Or you could just edit all the conf files and replace the Le_ReloadCmd= line – you only have to generate it once, after all.
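
If you do go the editing route, generating the encoded value might look something like the sketch below. It assumes a base64 utility that accepts -d for decoding (GNU coreutils and recent FreeBSD do; older systems may differ), and the command string is simply the one used above. Paste the result between the start and end markers already present in your conf file, leaving the markers themselves alone.

```shell
#!/bin/sh
# The reload command we want acme.sh to remember (same one as above).
CMD="service -j web apache24 restart"

# Encode it. tr strips any line breaks base64 might add to long output.
ENC=$(printf '%s' "$CMD" | base64 | tr -d '\n')
echo "$ENC"

# Decode it again to check nothing was mangled on the way.
DEC=$(printf '%s' "$ENC" | base64 -d)
echo "$DEC"
```

The second line printed should be identical to the command you started with. If it isn’t, don’t paste anything.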

Unix chmod and chown

Following on from Basic UNIX file commands, here’s a bit there wasn’t time for on changing metadata on files.

These two commands change a file’s permissions and ownership. Permissions are information associated with a file that decides who can do what with it. This was once called the file mode, which is why the command is chmod (CHange MODe). Files also have owners, both an individual user and a group of users, and the command to change this is chown (CHange OWNer).

chown is easiest, so I’ll start there. To make a file belong to fred the command is:

chown fred myfile

To change the owning group to accounts:

chown :accounts myfile

And to change both at once:

chown fred:accounts myfile

Changing a file’s permissions is more tricky, and there are several ways of doing it, but this is probably the easiest to remember. You’ll recall that each file has three sets of permissions: Owner, Group and Other. The permissions themselves are read, write and execute (i.e. it’s an executable program).

chmod can set or clear a load of permissions in one go, and the format is basically who the change applies to (u, g or o), then ‘+’ or ‘-’ for set or clear, and then the permissions themselves. What? It’s probably easier to explain with a load of examples:

chmod u+w myfile

Allows the user of the file to write to it (u means user/owner)

chmod g+w myfile

Allows any user in the group the file belongs to to write to it.

chmod o+r myfile

Allows any user who is neither the owner nor in the file’s group to read it. (o means “other”).

You can combine these options:

chmod ug+rw myfile

Allows the owner and the group to read and write the file.

chmod go-w myfile

Prevents anyone but the user from being able to modify the file.

If you want to run a program you’ve just written called myprog:

chmod +x myprog

If you don’t specify anything before the +/-, chmod applies the change to user, group and other alike, apart from any permissions your umask masks out.

You might notice an ‘x’ permission on a directory – in this case it means the directory is searchable to whoever has the permission.
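
To see the effect of each change without risking a real file, something like the following sketch works on a scratch file. It uses ‘a’ (meaning user, group and other together) to start from a blank slate, and trims the output of ls -l down to the mode string; the mode function name is just for illustration.

```shell
#!/bin/sh
# Scratch file so nothing important is touched.
f=$(mktemp)

# Print just the ten-character mode string from ls -l.
mode() { ls -l "$1" | cut -c1-10; }

chmod a-rwx "$f"                 # clear every permission
chmod u+rw "$f"; M1=$(mode "$f") # owner may read and write
chmod g+r "$f";  M2=$(mode "$f") # group may now read
chmod g-r "$f";  M3=$(mode "$f") # and now it can't again

echo "$M1"   # -rw-------
echo "$M2"   # -rw-r-----
echo "$M3"   # -rw-------

rm "$f"
```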

Basic UNIX file commands

I was asked to explain basic Unix shell file manipulation commands, so here goes.

If you’re familiar with MS-DOS, or Windows CMD.EXE and PowerShell (or even CP/M) you’ll know how to manipulate files and directories on the command line. It’s tempting to think that the Unix command line is the same, but there are a few differences that aren’t immediately apparent.

There are actually two main command lines (or shells) for Unix: sh and csh. Others are mainly compatible with these two, with the most common clones being bash and tcsh respectively. Fortunately they’re all the same when it comes to basic commands.

Directory Concepts

Files are organised into groups called “directories”, which are often called “Folders” on Macintosh and Windows. It’s not a great analogy, but it’s visual on a GUI. Unlike the real world, a directory may contain additional directories as well as files. These directories (or sub-directories) can also contain files and more directories and so on. If you drew a diagram you’d end up with something looking like a tree, with directories being branches coming off branches and the files themselves being leaves.
All good trees start with a root from which the rest branches off, and this is no different. The start of a Unix directory tree is known as the root.

Unix has a concept called the Current Working Directory. When a program is looking for a file it is assumed it will be found in the Working Directory if no other location is specified.

Users on a Unix system have an assigned Home Directory, and their Working Directory is initially set to this when they log on.

Users may create whatever files and sub-directories they need within their Home Directory, and the system will allow them to do whatever they want with anything they create as it’s owned by them. It’s possible for a normal user to see other directories on the system, in fact it’s necessary, but generally they won’t be able to modify files outside their home directory.

Here’s an example of a directory tree. It starts with the root, /, and each level down adds to the directory “path” to get to the directory.

If you want to know what your current working directory is, the first command we’ll need is “pwd” – or “Print Working Directory”. If you’re ever unsure, use pwd to find out where you are.

Unix commands tend to be short to reduce the amount of typing needed. Many are two letters, and few are longer than four.

The thing you’re most likely to want to do is see a list of files in the current directory. This is achieved using the ls command, which is a shortened form of LiSt.

Typing ls will list the names of all the files and directories, sorted into ASCII order and arranged into as many columns as will fit on the terminal. You may be surprised to see files that begin with “X” are ahead of files beginning with “a”, but upper case “X” has a lower ASCII value than lower case “a”. Digits 0..9 have a lower value still.

ls has lots of flags to control its behaviour, and you should read the documentation if you want to know more of them.

If you want more detail about the files, pass ls the ‘-l’ flag (that’s lower-case L, and means “long form”). You’ll get output like this instead:

drw-r-----  2 fjl  devs        2 Aug 28 13:17 Release
drw-r-----  2 fjl  devs       29 Dec 26  2019 Debug
-rw-r-----  1 fjl  devs     2176 Feb 17  2012 editor.h
-rw-r-----  1 fjl  devs    28190 Feb  7  2012 fbas.c
-rw-r-----  1 fjl  devs    10197 Feb 17  2012 fbas.h
-rw-r-----  1 fjl  devs     5590 Feb 17  2012 fbasexpr.c
-rw-r-----  1 fjl  devs     7556 Feb  3  2012 fbasheap.c
-rw-r-----  1 fjl  devs     7044 Feb  4  2012 fbasio.c
-rw-r-----  1 fjl  devs     4589 Feb  3  2012 fbasline.c
-rw-r-----  1 fjl  devs     4069 Feb  3  2012 fbasstr.c
-rw-r-----  1 fjl  devs     4125 Feb  3  2012 fbassym.c
-rw-r-----  1 fjl  devs    13934 Feb  3  2012 fbastok.c
drw-r-----  3 fjl  devs        3 Dec 26  2019 ipch
-rw-r-----  1 fjl  devs     3012 Feb 17  2012 token.h
:

I’m going to skip the first column for now and look at the third and fourth.

fjl devs

This shows the name of the user who owns the file, followed by the group that owns the file. Unix users can be members of groups, and sometimes it’s useful for a file to be effectively used by a group rather than one user. For example, if you have an “accounts” group and all your accounts staff belong to it, a file can be part of the “accounts” group so everyone can work on it.

Now we’ve covered users and groups we can return to the first column. It shows the file flags, which are various attributes concerning the file. If there’s a ‘-’ then the flag isn’t set. The last nine flags are three sets of three permissions for the file.

The first set are the file owner’s permissions (or rights to use the file).

The second set are the file group’s permissions.

The third are the permissions for any user who isn’t either the owner or in the file’s group

Each group of three represents, in order:

r – Can the file be read, but not changed.
w – can the file be written to. If not set it means you can only read it.
x – can the file be run (i.e. is it a program)

So:

– rw- --- --- means only the user can read/write the file.

– rw- r-- --- means the file can be read by anyone in its group but only written to by the owner.

– rwx r-x --- means the file is a program, probably written by its owner. Others in the group can run it, but no one else can even read it.

There are other special characters that might appear in the first field for advanced purposes but I’m covering the basics here, and you could write a book on ls.

I’ve missed off the first ‘-’, which isn’t a permission but indicates the type of the file. If it’s a ‘-’ it’s just a regular file. A ‘d’ means it’s actually a directory. You’ll sometimes see ‘c’ and ‘s’ on modern systems, which are normally disk drives and network connections. Unix treats everything like a file so disk drives and network sockets can be found in the directory tree too. You’ll probably see ‘l’ (lower case L) which means it’s a symbolic link – a bit like a .LNK file in Windows.

This brings us to the second column, which is a number. It is the number of times the file exists in the directory tree thanks to there being links, and in most cases this will be one. I’ll deal with links later.

The last three columns should be easy to guess: Length, date and finally the name of the file; at least in the case of a regular file.

There are many useful and not so useful options supported by ls. Here are a few that might be handy.

-d

By default, if you give ls a directory name it will show you the contents of the directory. If you want to see the directory itself, most likely because you want to see its permissions, specify -d.

-t

Sort output by time instead of ASCII

-r

Reverse the order of sort. -rtl is useful as it will sort your files with the newest at the end of the list.

-h

Instead of printing a file size in bytes, which could be a very long number, display in “human readable” format, which restricts it to three characters followed by a suffix: B=bytes, K=Kb, M=Mb and so on.

-F

This is very handy if you’re not using -l, as with just the name printed you can’t tell regular and special files apart. This causes a single character to be added to the file name: ‘*’ means it’s a program (has the x flag set), ‘/’ means it’s a directory and ‘@’ means it’s a symbolic link. Less often you’ll see ‘=’ for a socket, ‘|’ for a FIFO (obsolete) and ‘%’ for a whiteout file (insanity involving union mounts).

Finally, ls takes arguments. By default it lists everything in the current directory, but if you give it a list of files and directories it will list just those.

Where src is a directory, ls src will list the files in that directory. Remember ls -d if you just want information on the directory?

ls src

List everything in the src and obj directories:

ls src obj
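
If you’d like to try the difference between ls, ls -d and ls -F without touching your own files, a rough sketch along these lines builds a throwaway directory first. The file names (main.c and so on) are made up for the purpose, and LC_ALL=C is set only so the sort order is predictable.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL
work=$(mktemp -d)
cd "$work" || exit 1

mkdir src obj
touch src/main.c src/util.c obj/main.o

IN_SRC=$(ls src)         # the contents of src
SRC_ITSELF=$(ls -d src)  # just the directory entry itself
MARKED=$(ls -F)          # names with a trailing / on directories

echo "$IN_SRC"
echo "$SRC_ITSELF"
echo "$MARKED"

# Tidy up the scratch area.
rm src/main.c src/util.c obj/main.o
rmdir src obj
cd / && rmdir "$work"
```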

Now you can find the names of the files, how do you look at what’s in them? To display the contents of a text file the simple method is cat.

cat test.c

This prints the contents of test.c. You might just want to see the first few lines, so instead try:

head test.c

Only the first ten lines (by default) are printed.

If you want to see the last ten lines, try:

tail test.c

If you want to go down the file a screenful at a time, use:

more test.c

It stops and waits for you to press the space bar after every screen. If you’ve read enough, hit ‘q’ to quit.

less test.c

This is the latest greatest file viewer and it allows you to scroll up and down a file using the arrow keys. It’s got a lot of options.

So far we’ve stayed in our home directory, where we have kept all our files. But eventually you’re going to need to organise your files in a hierarchical structure in directories.

To make a directory called “new” type:

mkdir new

This is a mnemonic for “make directory”.

To change your working directory use the chdir command (Change Directory)

chdir new

Most people use the abbreviated synonym for chdir, “cd”, so this is equivalent:

cd new

Once you’re there, type “pwd” to prove we’ve moved:

pwd

If you type ls now you won’t see any files, because it’s empty.

You can also specify the directory explicitly, such as:

cd /usr/home/fred/new

If you don’t start with the root ‘/’, cd will usually start looking for the name of the new directory in the current working directory.

To move back up one level in the directory tree use this command:

cd ..

You’ll be back in your home directory.

To get rid of the “new” directory use the rmdir command (ReMove DIRectory)

rmdir new

This only works on empty directories, so if there were any files in it you’d have to delete them first. There are other more dangerous commands that will destroy directories and all their contents but it’s better to stick with the safer ones!
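
You can watch rmdir refuse to remove a directory that still has something in it with a sketch like this one. It works in a temporary directory, and the file name “unwanted” is only an example.

```shell
#!/bin/sh
work=$(mktemp -d)
cd "$work" || exit 1

mkdir new
touch new/unwanted

# First attempt: "new" still contains a file, so rmdir should refuse.
if rmdir new 2>/dev/null; then
    FIRST=removed
else
    FIRST=refused
fi

# Delete the file, then try again.
rm new/unwanted
if rmdir new; then
    SECOND=removed
else
    SECOND=refused
fi

echo "first attempt: $FIRST, second attempt: $SECOND"

cd / && rmdir "$work"
```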

To remove an individual file use the rm (ReMove) command, in this case the file being named “unwanted”:

rm unwanted

Normally files are created by applications, but if you want a file to experiment on the easiest way to create one is “touch filename”, which creates an empty file called “filename”. You can also use the echo command:

echo "This is my new text file, do you like it?" > myfile.txt

Echo prints stuff to the screen, but “> myfile.txt” tells Unix to put the output of the echo command into “myfile.txt” instead of displaying it on the screen. We’ll use “echo” more later.

You can display the contents with:

cat myfile.txt

One thing you’re going to want to do pretty soon is copy a file, which is achieved using the cp (CoPy) command:

cp myfile.txt copy-of-myfile.txt

This makes a copy of the file and calls it copy-of-myfile.txt

You can also copy it into a directory

mkdir new
cp myfile.txt new

To see it there, type:

ls -l new

To see the original and the copy, try:

ls -l myfile.txt new

If you wanted to delete the copy in “new” use the command:

rm new/myfile.txt

Perhaps, instead of copying your file into “new” you wanted to move it there, so you ended up with only one copy. This is one use of the mv (MoVe) command:

mv myfile.txt new

The file will disappear from your working directory and end up in “new”.

How do you rename a file? There’s no rename command, but mv does it for you. When all is said and done, all mv is doing is changing the name and location of a file.

cd new
mv myfile.txt myfile.text

That’s better – much less Microsoft, much more Unix.

Wildcards

So far we’ve used commands on single files and directories, but most of these commands work with multiple files in one go. We’ve given them a single parameter but we could have used a list.

For example, if we wanted to remove three files called “test”, “junk” and “foo” we could use the command:

rm test junk foo

If you’re dealing with a lot of files you can have the shell create a list of names instead of typing them all. You do this by specifying a sort of “template”, and all the files matching the template will be added to the list.

This might seem the same as Windows, but it’s not – be careful. With Windows the command does the pattern matching according to its context, but the Unix shell has no context and you may end up matching more than you intended, which is unfortunate if you’re about to delete stuff.

The matching against the template is called globbing, and uses the special characters ‘*’ and ‘?’ in its simplest form.

‘?’ matches any single character, whereas ‘*’ matches zero or more characters. All other characters except ‘[‘ match themselves. For example:

“?at” would match cat, bat and rat. It would not match “at” as it must have a first character. Neither will it match “cats” as it’s expecting exactly three characters.

“cat*” would match cat, cats, caterpillar and so on.

“*cat*” would match all of the above, as well as “scatter”, “application” and “hellcat”.

You can also specify a list of allowable letters to match between square brackets [ and ], which means any single character will do. You can specify a range, so [0-9] will match any digit. Putting a ‘!’ in front negates the match, so [!0-9] will match any single character that is NOT a digit. If you want to match a two-digit number use [0-9][0-9].

To test globbing out safely, I recommend using the echo command. It works like this:

echo Hello world

This prints out Hello world. Useful, eh? But technically what it’s doing is taking all the arguments (aka parameters) one by one and printing them. The first argument is “Hello” so it prints that. The second is “world” so it prints a space and prints that, until there are no arguments left.

Suppose we type this:

echo a*

The Unix shell globs it using the * special character and produces a list of all files that start with the letter ‘a’.

You can use this, for example, to specify all the ‘C’ files ending in .c:

echo *.c

If you want to include .h files in this, use

echo *.c *.h

Practice with echo to see how globbing works as it’s non-destructive!

You can also use ls, although this goes on to expand directories into their contents, which can be confusing.
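
Here’s a small sandbox for trying the patterns described above. It’s only a sketch: the file names are invented to match the examples, everything happens in a temporary directory, and LC_ALL=C is set so the shell sorts the matches in plain ASCII order.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL
work=$(mktemp -d)
cd "$work" || exit 1

touch at bat cat cats scatter v1 v12 vx

G1=$(echo ?at)          # exactly three characters ending in "at"
G2=$(echo cat*)         # anything starting with "cat"
G3=$(echo *cat*)        # anything containing "cat"
G4=$(echo v[0-9][0-9])  # "v" followed by exactly two digits
G5=$(echo v[!0-9])      # "v" followed by one non-digit

printf '%s\n' "$G1" "$G2" "$G3" "$G4" "$G5"

# Tidy up the scratch area.
rm at bat cat cats scatter v1 v12 vx
cd / && rmdir "$work"
```

Note that “at” never appears in the first result, and “v1” doesn’t appear in the last two.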

When you have a command that has a source and destination, such as cp (CoPy), it will interpret everything in the list as a file to be processed apart from the last, which it will expect to be a directory. For example:

cp test junk foo rubbish

Will copy “test”, “junk” and “foo” into an existing directory rubbish.

Now for a practical example. Suppose you have a ‘C’ project where everything is in one directory. .c files, .h files, .o files as well as the program itself. You want to sort this out so the source is in one directory and the objects in another.

Create two directories like this:

mkdir src obj
mv *.c *.h src
mv *.o obj

All done!

Mirrored swap devices

Although some of this is BSD specific, the principles apply to any Unix or Linux.

When you install your Unix-like OS across several disks, either with a mirror or RAID system (particularly ZFS RAIDZ), you’ll be asked if you want to set up a swap partition, and if you want it mirrored.

The default (for FreeBSD) is to add a swap partition on every disk and not mirror it. This is actually the most efficient configuration apart from having dedicated swap drives, but is also a spectacularly bad idea. More on this later.

What is a swapfile/drive anyway?

The name is a hangover from early swapping multi tasking systems. Only a few programs could fit in main memory, so when their time allocation ran out they were swapped with others on a disk until it was their turn again.

These days we have “virtual memory”, where a Memory Management Unit (MMU) fixes it so blocks of memory known as pages are stored on disk when not in use and automatically loaded when needed again. This is much more effective than swapping out entire programs but needs MMU hardware, which was once complex, slow and expensive.

So the swap partition should really be called the paging partition now, and Microsoft actually got the name right on Windows. But we still call it the swap partition.

What you need to remember is that parts of a running program’s memory may be in the swap partition instead of RAM at any time, and that includes parts of the operating system.

Strategies

There are several ideas for swap partitions in the 2020s.

No swap partition

Given RAM is so cheap, you can decide not to bother with one, and this is a reasonable approach. Virtual memory is slow, and if you can, get RAM instead. It can still pay to have one though, as some pages of memory are rarely, if ever, used again once created. Parts of a large program that aren’t actually used, and so on. The OS can recognise this and page them out, using the RAM for something useful.

You may also encounter a situation where the physical RAM runs out, which will mean no further programs can be run and those already running won’t be able to allocate any more. This leads to two problems: Firstly “Developers” don’t often program for running out of memory and their software doesn’t handle the situation gracefully. Secondly, if the program you need to run is your login shell you’ll be locked out of your server.

For these reasons I find it better to have a swap partition, but install enough RAM that it’s barely used. As a rule of thumb, I go for having the same swap space as there is physical RAM.

Dedicated Swap Drive(s)

This is the classic gold standard. Use a small, fast (and expensive) drive, preferably short stroked, so your virtual memory goes as fast as possible. If you’re really using VM this is probably the way to go, and having multiple dedicated drives spreads the load and increases performance.

Swap partition on single drive

If you’ve got a single drive system, just create a swap partition. It’s what most installers do.

Use a swap file

You don’t need a drive or even a partition. Unix treats devices and files the same, so you can create a normal file and use that.

truncate -s 16G /var/swapfile
swapon /var/swapfile

You can swap on any number of files or drives, and use “swapoff” to stop using a particular one.

Unless you’re going for maximum performance, this has a lot going for it. You can allocate larger or smaller swap files as required and easily reconfigure a running system. Also, if your file system is redundant, your swap system is too.
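
A caveat, offered as a sketch rather than gospel: as far as I’m aware FreeBSD’s swapon wants a device node rather than a plain file, so in practice the file is usually wrapped in a memory disk with mdconfig first, and it’s safer to fill the file with dd than to leave it sparse. The script below only echoes the commands, so nothing happens until you remove the “echo”s. The md unit number 99 is an arbitrary choice of mine; the path and size are the ones used above.

```shell
#!/bin/sh
# Dry run: prints the commands instead of running them.
SWAPFILE=/var/swapfile   # path from the example above
SIZE_MB=16384            # 16G expressed in megabytes
UNIT=99                  # hypothetical md unit number, pick a free one

plan() {
    echo dd if=/dev/zero of=$SWAPFILE bs=1m count=$SIZE_MB
    echo chmod 0600 $SWAPFILE
    echo mdconfig -a -t vnode -f $SWAPFILE -u $UNIT
    echo swapon /dev/md$UNIT
}

plan
```

On Linux the details differ (mkswap, and dd wants bs=1M), so treat this as FreeBSD-flavoured.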

Multiple swap partitions

This is what the FreeBSD installer will offer by default if you set up a ZFS mirror or RAIDZ. It spreads the load across all drives. The only problem is that the whole point of a redundant drive system is that it will keep going after a hardware failure. With a bit of swap space on every drive, the system will fail if any of the drives fails, even if the filing system carries on. Any process with RAM paged out to swap gets knocked out, including the operating system. It’s like pulling out RAM chips and hoping it’s not going to crash. SO DON’T DO IT.

If you are going to use a partition on a data drive, just use one. On an eight drive system the chances of a failure on one of eight drives are eight times higher than on one specific unit, so you reduce the probability of failure considerably by putting all your eggs in one basket. Counterintuitive? Consider that if one basket falls on a distributed swap, they all do anyway.
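
To put rough numbers on that, here’s a back-of-envelope sketch using awk. The 2% failure chance per drive is a figure I’ve invented purely for illustration, and it assumes drives fail independently of each other. The exact chance of at least one of n drives failing is 1-(1-p)^n, which for small p is close to n times p.

```shell
#!/bin/sh
P=0.02   # assumed chance of one particular drive failing (made up)
N=8      # number of drives carrying swap

RESULT=$(awk -v p="$P" -v n="$N" 'BEGIN {
    any = 1 - (1 - p) ^ n          # at least one of n drives fails
    printf "one specific drive: %.4f\n", p
    printf "any of %d drives:    %.4f\n", n, any
    printf "rule of thumb n*p:  %.4f\n", n * p
}')
echo "$RESULT"
```

With these figures the exact answer comes out a little under the “eight times” rule of thumb, but near enough for the argument.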

Mirrored swap drives/partitions

This is sensible. The FreeBSD installer will do this if you ask it, using geom mirror. I’ve explained gmirror in posts passim, and there is absolutely no problem mixing it with ZFS (although you might want to read earlier posts to avoid complications with GPT). But the installer will do it automatically, so just flip the option. It’s faster than a swap file, although this will only matter if your job mix actually uses virtual memory regularly. If you have enough RAM, it shouldn’t.

You might think that mirroring swap drives is slower – and to an extent it is. Everything has to be written twice, and the page-out operation will only complete when both drives have been updated. However, on a page-in the throughput is doubled, given the mirror can read either drive to satisfy the request. The chances are there will be about the same, or slightly more page-ins so it’s not the huge performance hit it might seem at first glance.

Summary

Method                                     | Pros                                                             | Cons
-------------------------------------------+------------------------------------------------------------------+----------------------------------------------------------------
No swap                                    | Simple; fastest                                                  | Wastes RAM; can lead to serious problems if you run out of RAM
Dedicated Swap Drive(s)                    | Simple; optimal performance                                      | Each drive is a single point of failure for the whole system
Multiple Swap Partitions                   | Improved performance; lower cost than dedicated                  | Each drive is a single point of failure for the whole system
Single swap partition (multi-drive system) | Simple; lower probability of single point of failure occurring   | Reduced performance; still has single point of failure
Mirrored drives or partitions              | No single point of failure for the whole system                  | Reduced performance
Swap file                                  | Flexible even on live system; redundancy the same as drive array | Reduced performance

Quick summary of different swap/paging device strategies.

Conclusion

Having swap partitions on multiple drives increases your risk of a fault taking down a server that would otherwise keep running. Either use mirrored swap partitions/drives, or use a swap file on redundant storage. The choice depends on the amount of virtual memory you use in normal circumstances.

Microsoft sued over Windows 11 debacle

I’m not normally a fan of vexatious litigation, but when someone decides to harass Microsoft over their outrageous move to force 240 million Windows PCs onto the scrap heap I can only applaud.

The heroic litigant is a chap called Lawrence Klein, who is from Southern California in case it wasn’t obvious.

He’s not actually after them for billions, but reckons they’re abusing monopoly power and wants the judge to force them to provide security updates for Windows 10 until it’s only 10% of the installed base. It seems very reasonable to me.

The complaint was filed in San Diego. Mr Klein is right on the button and isn’t holding back.

The text of the complaint can be found on the Courthouse News web site.

FreeBSD/Linux as Fibre Broadband router

British Telecom, bless them, has decided that copper telephone lines have to go and is forcing everyone onto fibre Internet and VoIP. Except rural customers currently connected to the Internet using a wet piece of string if they’re lucky, of course.

Incidentally, “Fibre Broadband” is a nonsense in a technical sense but the battle is lost – the public believes Broadband is any Internet connection to the home that isn’t dial-up.

Although I’ve written about routing on FreeBSD before, I thought it was time for an update. Why route on FreeBSD? Because unlike the cheap and nasty “routers” supplied by domestic (and some commercial) ISPs, it doesn’t crash. You don’t have to turn it off and on again. And it does what it’s told, with great diagnostics. You can also run plenty of other services on the same box if it’s powerful enough, or your throughput is modest.

Most of this should work fine on Linux, although the networking is generally considered less efficient than the real thing. However, at less than 1Gbps on a single line this isn’t going to matter, if it matters at all. With Linux you get less of the nuts and bolts built in to the base system so you may have to install extra packages depending on which distribution you are using. But this is all standard stuff so shouldn’t be too difficult. It’s the settings that matter, and probably the reason you’re reading this!

In this first article I’ll just consider a gateway router with NAT, and leave DNS, DHCP and other options until later.

Setting up PPPoE using user-ppp

First off, your WAN connection. With FTTC and FTTP this is normally a little white box – either a VDSL modem or an ONT. It connects to the phone line or fibre cable on one end, and has an RJ45 on the other that looks like Ethernet, because it is Ethernet. I’m going to call them Ethernet Modems, as they’re treated the same for our purpose. However, being Ethernet won’t do you much good as it’s just talking a protocol called PPPoE – or Point-to-Point Protocol over Ethernet.

PPP is an old protocol for making an Internet connection using dial-up, but it’s evolved (or suffered mission creep) and it’s now rather complicated thanks to all the baggage. Fortunately you can ignore the baggage and concentrate on the PPPoE stuff, once you know which is which. And that’s always the trick.

You’ll need a host (i.e. computer) with two Ethernet ports unless you want a complicated life. If you’re using an old PC with just one you can get away with a USB3 Ethernet adapter, but having a couple of server-grade NICs on the motherboard or add-on cards is the best way to go. Very generally, Intel or Broadcom are good choices, Realtek is at the low end.

You need to connect your Ethernet Modem to one port on your host and the other port goes to the LAN.

If your Ethernet Modem and the host you’re planning to use as a router are in different places you can connect them using a VLAN. It’s proper Ethernet and can be switched. Without a VLAN it’s not so simple, so plug it in using a direct cable.

PPP is built in (to FreeBSD etc) in the base system. Type ppp (as root) and it’ll start up in interactive mode. If it doesn’t, you’re not using BSD and therefore lack a base system and will have to install it as a package. You might like to start here: https://tldp.org/HOWTO/PPP-HOWTO/

Although you can compile PPP support into the kernel, the ppp we’re talking about is a program written by Toshiharu OHNO and Brian SOMERS in the early 1990s, and part of BSD since FreeBSD and OpenBSD 2. It’s the normal straightforward way of doing things.

ppp has a simple config file in /etc/ppp/ppp.conf. It can contain profiles for multiple services in sections, with the service name being arbitrary, and ending in a colon (“:”). You specify the service when you run it, and stuff in other sections is ignored. This is a hangover from the days when people had multiple dial-up connections.
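To make the layout concrete, here is a small sketch that lists the service names in a ppp.conf-style file. It assumes labels start in the first column and end in a colon while the commands under them are indented. The function name list_profiles and the sample file contents are invented for illustration.

```sh
#!/bin/sh
# Sketch: list the section (service) names in a ppp.conf-style file.
# Assumes labels start in column one and end in ":"; commands are indented.

list_profiles() {
    # Print each label without its trailing colon; skip comment lines
    sed -n 's/^\([^[:space:]#][^[:space:]]*\):[[:space:]]*$/\1/p' "$1"
}

# Demonstration on a throwaway file with made-up contents
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
default:
  set log Phase Chat LCP IPCP CCP tun command

cloudscape:
  set device PPPoE:bge1

# backup:
dialup-backup:
  set device /dev/cuau0
EOF

list_profiles "$tmp"
rm -f "$tmp"
```

On a real system you would run list_profiles /etc/ppp/ppp.conf to see which names you can pass to ppp.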

Here’s a service definition for Cloudscape, one of my favourite ISPs, but other UK FTTP services will be similar or identical. UK FTTC and SoGEA modems are pretty much the same too.

cloudscape:
  delete default                # May already have a
                                # default route configured elsewhere
  set device PPPoE:bge1
  set authname user-name-supplied-by-ISP
  set authkey password-supplied-by-ISP
  set dial
  set login
  set lcp
  set mru 1492
  set mtu 1492
  disable ipv6cp              # Turn off IPv6
  enable ipcp                 # Turn on IPv4 (default)
#  enable lqr                 # Turn on Link Quality Requests
                              #   (detect dropped line)
  enable echo                 # Enable echo for LQR
  iface name wan0
  add default HISADDR

The ppp program was originally used for serial PPP connections to dial-up ISPs or organisations, but here we’re just using it for PPPoE. In support of switching ISPs it can add stuff to config files like resolv.conf and the routing table, which in the old days tended to be dynamic.

Feel free to read the manual that explains what the options above do, but briefly I’m starting by deleting the default route, which probably won’t exist unless you’ve configured it (possibly using DHCP), but if it does will cause problems when ppp adds another.

  set device PPPoE:bge1

This says we’re using PPPoE over the bge1 Ethernet card. Obviously set this to the Ethernet card to which your Ethernet Modem (e.g. ONT) is attached.

  set authname user-name-supplied-by-ISP
  set authkey password-supplied-by-ISP

This is the user-name and password supplied by your ISP. These tend to be low security, but are needed for the protocol for historic reasons.

  set dial
  set login
  set lcp

This will cause ppp to dial, log in and get details using LCP. Some people will try to tell you that internet lines are configured with DHCP – that’s for LANs. LCP (Link Control Protocol), together with its companion IPCP, provides the same function, such as what your IP address is and which DNS servers to use, over a point-to-point connection.

  set mru 1492
  set mtu 1492

PPPoE adds eight bytes of protocol data to the payload of every Ethernet frame, so a standard 1500-byte packet won’t fit 1:1 inside a PPPoE frame. Reducing the MTU to 1492 gets around this and avoids fragmentation, which is a good thing. LCP might suggest or force a lower MTU but there’s no harm in specifying it.
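The arithmetic is simple enough to check in the shell. This is only a sketch: the eight bytes are the 6-byte PPPoE header plus the 2-byte PPP protocol field, and the TCP MSS figure assumes IPv4 with no IP or TCP options (20 bytes each). The function names are mine, not part of ppp.

```sh
#!/bin/sh
# Sketch: work out the PPPoE MTU and the matching TCP MSS.
# Function names are illustrative, not part of ppp.

ETHER_PAYLOAD=1500   # standard Ethernet payload size
PPPOE_HDR=6          # PPPoE header
PPP_PROTO=2          # PPP protocol ID field

pppoe_mtu() {
    # Largest IP packet that fits in one Ethernet frame over PPPoE
    echo $(( ${1:-$ETHER_PAYLOAD} - $PPPOE_HDR - $PPP_PROTO ))
}

tcp_mss() {
    # MSS for IPv4, assuming 20-byte IP and TCP headers (no options)
    echo $(( $1 - 20 - 20 ))
}

mtu=$(pppoe_mtu)
echo "PPPoE MTU: $mtu"
echo "TCP MSS:   $(tcp_mss "$mtu")"
```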

  disable ipv6cp              # Turn off IPv6
  enable ipcp                 # Turn on IPv4 (default)
#  enable lqr                 # Turn on Link Quality Requests
                              #  (detect dropped line)
  enable echo                 # Enable echo for LQR

This disables IPv6 and enables IPv4 (which is on by default anyway). If you want to use IPv6 your service provider needs to support it, and most don’t.

LQR is probably not going to be necessary for our purposes and generates warnings, so I’ve left the line in but commented it out for now. The enable echo therefore has no effect.

  iface name wan0

By default, ppp will name its connections as tun0, tun1 and so on (tun being Tunnel). This means that you never know what the interface is going to be called, as other tunnels may exist before you start this one. We’re going to be referring to the interface in the PF firewall, so it helps to be sure what its name will be. The line above sets the name manually, and I’ve called it wan0, which is logical. You may, of course, have multiple WAN connections including dial-up backups, so giving them a sensible name is, er, sensible. You can call it anything you like if you’re nuts.

  add default HISADDR

This is an example of ppp messing with your system configuration – in this case it’s taking the IP address supplied by LCP, represented by the macro HISADDR, and adding it as the default route. If you have a static IP address you might want to set it statically in the normal way.

Likewise, if you add the line “enable dns” it will take the DNS servers offered by LCP and add them to resolv.conf. It won’t remove them, and may well end up messing up whatever local DNS arrangements you have, so I prefer to do this manually.

Once you’ve edited ppp.conf you can test it out interactively with “ppp cloudscape” and see what happens. Type “dial” and it should make the connection, and wan0 should appear in your list of network interfaces. Use netstat -r to see if the new default route has appeared.
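If you want to script that last check, something like the following would do. It is only a sketch: it assumes the FreeBSD netstat -rn column order (Destination, Gateway, Flags, Netif), and the function name and the sample output are mine.

```sh
#!/bin/sh
# Sketch: report which interface carries the IPv4 default route.
# Reads "netstat -rn" style output on stdin; assumes the FreeBSD
# column order Destination / Gateway / Flags / Netif.

default_route_iface() {
    awk '$1 == "default" { print $4; exit }'
}

# Illustrative sample only; addresses are invented
sample='Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            203.0.113.1        UGS        wan0
127.0.0.1          link#3             UH          lo0
192.168.1.0/24     link#2             U          bge0'

iface=$(printf '%s\n' "$sample" | default_route_iface)
if [ "$iface" = "wan0" ]; then
    echo "default route is via wan0"
else
    echo "default route is via '${iface:-nothing}', not wan0"
fi
```

For real use, feed it live data with netstat -rn -f inet | default_route_iface.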

Setting up the pf firewall

user-ppp is a large program that tries to do everything, including NAT and being a firewall. This isn’t very UNIX-like in philosophy, but you can use these facilities if you like. I prefer to have a dedicated standard firewall, PF, and leave that to do everything firewall-like in one place.

If you’re setting up a router you’re probably going to need asymmetric NAT. Your /etc/pf.conf file will look something like this:

scrub in all
WAN=wan0
WANIP=1.2.3.4
nat pass on $WAN from 192.168.1.0/24 to any -> $WANIP
#rdr pass on $WAN proto tcp from any to $WANIP port 80 -> 192.168.1.123

The WAN IP comes from your ISP, although you will be able to see it using “ifconfig wan0” if you don’t have it. I’m assuming your LAN is 192.168.1.0/24 – just set this to whatever you’re using. And that’s about it.
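If you’d rather pull the address out with a script than read it off the screen, here is a sketch. It assumes the usual FreeBSD ifconfig layout for a point-to-point interface, where the local address is the second word on the inet line. The function name and the sample output (with documentation-range addresses) are made up.

```sh
#!/bin/sh
# Sketch: extract the local IPv4 address from "ifconfig <iface>" output.
# Reads the ifconfig text on stdin and prints the first inet address.

wan_ip() {
    awk '$1 == "inet" { print $2; exit }'
}

# Illustrative sample only; the addresses are invented
sample='wan0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1492
        options=80000<LINKSTATE>
        inet 203.0.113.45 --> 203.0.113.1 netmask 0xffffffff'

echo "WANIP=$(printf '%s\n' "$sample" | wan_ip)"
```

On the router itself that becomes ifconfig wan0 | wan_ip, and the result is what goes in the WANIP line of pf.conf.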

As a bonus, the commented out example line at the end would redirect external port 80 to a web server on LAN address 192.168.1.123 – an open port. Peter Hansteen has written an excellent book on PF, called “The Book of PF”, which will tell you everything you need to know, and it’s well documented in various online handbooks and man pages, unlike user-ppp’s built in firewall.

The only reason for using user-ppp for NAT is if you’re on a dynamic IP address, in which case add “nat enable yes” to ppp.conf and add ppp_nat=yes to /etc/rc.conf.

Kicking it all off

First you need to enable routing:

 sysctl net.inet.ip.forwarding=1

This will work until reboot, and you can turn it off again by setting it to zero if something bad happens, like your NIC catching fire. Then dial your ISP (Cloudscape in this example)

ppp -ddial cloudscape

You should now have a connection to the Internet on the BSD box. Now enable PF for NAT.

service pf start (or onestart)

Or if it’s running, use “service pf reload” to load the new config. At this point every machine on the LAN should be able to use the router’s LAN IP address as a gateway.
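A typo in pf.conf on a router is an easy way to ruin your afternoon, so it can be worth parsing the file before loading it. The sketch below relies on pfctl’s -n flag, which parses a ruleset without loading it. The wrapper function and the PFCTL override variable are my own inventions, the latter only there so the logic can be tried without a real pfctl.

```sh
#!/bin/sh
# Sketch: only load a new PF ruleset if it parses cleanly.
# PFCTL can be overridden for testing; by default it runs the real pfctl.

PFCTL=${PFCTL:-pfctl}

safe_pf_reload() {
    conf=${1:-/etc/pf.conf}
    # -n parses the file without loading it
    if "$PFCTL" -n -f "$conf"; then
        "$PFCTL" -f "$conf" && echo "loaded $conf"
    else
        echo "syntax error in $conf, ruleset left alone" >&2
        return 1
    fi
}
```

Run safe_pf_reload /etc/pf.conf as root. If the parse fails, the rules already in force are untouched.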

When you’re happy it works, to make this kick off automatically on boot, modify /etc/rc.conf:

sysrc ppp_enable=yes
sysrc ppp_mode=ddial
sysrc ppp_profile="cloudscape"
sysrc pf_enable=yes
sysrc gateway_enable=yes

Optionally “sysrc ppp_nat=yes” if you’re not using PF for NAT. Or if you’re editing rc.conf directly:

pf_enable=yes
gateway_enable=yes

ppp_enable="YES"
ppp_mode="ddial"
#ppp_nat="YES"	# We let PF do NAT
ppp_profile="name_of_service_provider"

I will do a part two to this post explaining how to configure DNS and DHCP, although there’s no reason these need to be on the same host you’re using as a router. In fact it’s good practice to separate them and have more than one DHCP and DNS server if you have the resources.

I hope you found it useful – any questions add a comment below.

How to tell if a host is up without ping

Some people seem to think that disabling network pings (ICMP echo requests to be exact) is a great security enhancement. If attackers can’t ping something they won’t know it’s there. It’s called Security through Obscurity and only a fool would live in this paradise.

But supposing you have something on your network that disables pings and you, as the administrator, want to know if it’s up? My favourite method is to send an ARP packet to the IP address in question, and you’ll get a response.

ARP is how you translate an IP address into a MAC address to get the Ethernet packet to the right host. If you want to send an Ethernet packet to 1.2.3.4 you put out an ARP request “Hi, if you’re 1.2.3.4 please send your MAC address to my MAC address”. If a device doesn’t respond to this then it can’t be on an Ethernet network with an IP address at all.

You can quickly write a program to do this in ‘C’, but you can also do it using a shell script, and here’s a proof of concept.

#!/bin/sh
! test -n "$1" && echo $0: Missing hostname/IP && exit
#arp -d $1  >/dev/null 2>/dev/null
ping -t 1 -c 1 -q $1 >/dev/null
arp $1 | grep -q "expires in" && echo $1 is up. && exit
echo $1 is down.

You run this with a single argument (hostname or IP address) and it will print out whether it is down or up.

The first line is simply the shell needed to run the script.

Line 2 bails out if you forget to add an argument.

Line 3, which is commented out, deletes the host from the ARP cache if it’s already there. This probably isn’t necessary in reality, and you need to be root user to do it. IP address mappings are typically deleted after 20 minutes, but as we’re about to initiate a connection in line 4 it’ll be refreshed anyway.

Line 4 sends a ping to the host. We don’t care if it replies. The timeout is set to the minimum 1 second, which means there’s a one second delay if it doesn’t reply. Other ways of tricking the host into replying exist, but every system has ping, so ping it is here.

Line 5 will print <hostname> is up if there is a valid ARP cache entry, which can be determined by the presence of “expires in” in the output. Adjust as necessary.

The last line, if still running, prints <hostname> is down. Obviously.

This only works across Ethernet – you can’t get an ARP resolution on a different network (i.e. once the traffic has got through a router). But if you’re on your organisation’s LAN and looking to see if an IoT device is offline, lost or stolen then this is a quick way to poll it and check.
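The grep in line 5 depends on the exact wording of FreeBSD’s arp output. As a sketch of how to make that test a little more explicit, the function below classifies one line of arp output. The sample lines are my recollection of the FreeBSD format with invented addresses, so treat them as assumptions and check what your own system prints.

```sh
#!/bin/sh
# Sketch: decide from one line of "arp <host>" output whether the
# host answered. Patterns assume FreeBSD wording; adjust as necessary.

arp_state() {
    case "$1" in
        *"(incomplete)"*)              echo down ;;   # asked, nobody answered
        *"no entry"*)                  echo down ;;   # nothing in the cache
        *"expires in"*|*"permanent"*)  echo up ;;     # a MAC address is cached
        *)                             echo unknown ;;
    esac
}

# Illustrative lines only; addresses are invented
arp_state '? (192.168.1.10) at 00:11:22:33:44:55 on bge0 expires in 1187 seconds [ethernet]'
arp_state '? (192.168.1.11) at (incomplete) on bge0 expired [ethernet]'
arp_state '192.168.1.12 (192.168.1.12) -- no entry'
```

In the script you would call it as arp_state "$(arp $1)" after the ping.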

Why can’t I ping my Amazon Echo?

The simple answer is that the current Amazon Echo devices don’t respond to a ping – or technically an ICMP echo request. There’s a lot of waffle on the web saying this is because they’re too simple to do it, but this isn’t the case. The original Echo (at least before software updates) and the Echo Show 8” most certainly did respond to a ping, but the functionality has been dropped since then. Some people naively think that it’s a security risk, part of a doctrine known as Security Through Obscurity. As it’s easy enough to find an Echo without a ping, it’s only a slight inconvenience to a would-be attacker and a big inconvenience to a network administrator.

Most later Echos do have open ports, however, so you can check to see if it’s alive because the port will be there. I emphasise “open”, as Echos use quite a lot of ports that aren’t always open, for things like setup or communicating out. But these ports are open and can be connected to – even if the connection is refused it shows there’s something there to refuse it.

Based on my incomplete collection of Echo devices, they have the following characteristics:

Model                                  Ping?  Ports
Original Echo
Echo Dot fourth Generation                    1080, 6543, 8888
Echo Flex                                     1080, 8888
Echo Dot Second Generation                    1080, 8888
Echo Dot Third Generation                     1080, 8888
Echo Show 8-inch (second generation)   Y      8009
Echo Spot first Generation
Echo Show 5-inch
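For completeness, here is a sketch of the port check using nc, which you could point at the ports in the table. The -z flag makes nc connect and close without sending data, and -w sets a timeout in seconds. This simple version only reports ports that accept a connection; it can’t tell a refused connection from a timeout. The function name and the NC override variable are hypothetical.

```sh
#!/bin/sh
# Sketch: try a list of TCP ports on a host and report those that accept.
# NC can be overridden for testing; by default it is the real nc.

NC=${NC:-nc}

open_ports() {
    host=$1; shift
    for port in "$@"; do
        # -z: connect only, send nothing; -w 1: give up after one second
        if "$NC" -z -w 1 "$host" "$port" >/dev/null 2>&1; then
            echo "$port"
        fi
    done
}
```

Something like open_ports 192.168.1.50 1080 6543 8009 8888 (the address is made up) prints one line per port that accepted.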

So how can you reliably tell if your Amazon Echo device is alive on the network? Rather than messing around with ports, my favourite way is to send it an Ethernet ARP request and see if you get a reply. I did say disabling ping was a fool’s solution to security.

See “How to tell if a host is up without ping” for how to do this.

Add mirror to single ZFS disk

So you have a single-drive FreeBSD ZFS machine and you want to add a second drive to mirror the first because it turns out it’s now important. Yes, it’s possible to do this after installation, even if you’re booting off ZFS.

Let’s assume your first drive is ada0, and it’s had the FreeBSD installer set it up as a “stripe on one drive” using GPT partitioning. You called the existing zpool “zroot” as you have no imagination whatsoever. In other words everything is the default. The new disk is probably going to be ada1 – plug it in and look on the console or in /var/log/messages to be sure. As long as it’s the same size or larger than the first, you’re good to go. (Use diskinfo -v if you’re not sure).

FreeBSD sets up boot partitions and swap on the existing drive, and you’ll probably want to do this on the new one, if for no other reason than if ada0 fails it can boot off ada1.

gpart destroy -F ada1
gpart backup ada0 | gpart restore ada1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

The first two lines get rid of any old partition table that might be there, then copy the existing one from ada0 (which will include the boot and swap partitions as well as the ZFS one).

The third line installs a protective MBR on the disk to avoid non-FreeBSD utilities doing bad things and then adds the ZFS boot code.

If there’s a problem, zero the disk using dd and try again. Make sure you zap the correct drive, of course.

dd if=/dev/zero of=/dev/ada1 bs=32m status=progress

Once you’ve got the partition and boot set up, all you need to do is attach it to the zpool. This is where people get confused as if you do it wrong you may end up with a second vdev rather than a mirror. Note that the ZFS pool is on the third partition on each drive – i.e. adaxp3.

The trick is to specify both the existing and new drives:

zpool attach zroot ada0p3 ada1p3

Run zpool status and you’ll see it (re)silvering the new drive. No interruptions, no reboot.

  pool: zroot
 state: ONLINE
  scan: resilvered 677M in 00:00:18 with 0 errors on Sat Apr 5 16:13:16 2025
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

This only took 18 seconds to resilver as in this case it’s just a system disk, and ZFS doesn’t bother copying unnecessary blocks.
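On a bigger pool the resilver will take a good deal longer, and you may want a script to wait for it. As a sketch, the function below looks at zpool status text for the “resilver in progress” wording that ZFS prints on the scan line while it’s still working. The function name is mine, and the demonstration just feeds it the finished state from the output shown earlier.

```sh
#!/bin/sh
# Sketch: tell from "zpool status" output whether a resilver is still running.
# Reads the status text on stdin; succeeds while a resilver is in progress.

resilver_running() {
    grep -q 'scan:.*resilver in progress'
}

# The finished state, as in the example output from this post
printf '%s\n' \
    '  pool: zroot' \
    ' state: ONLINE' \
    '  scan: resilvered 677M in 00:00:18 with 0 errors on Sat Apr 5 16:13:16 2025' |
if resilver_running; then
    echo "still resilvering"
else
    echo "resilver finished"
fi
```

To wait on a live pool you could use: while zpool status zroot | resilver_running; do sleep 10; done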

If you want to remove it and go back to a single drive the command is:

zpool detach zroot ada1p3

Add another to create a three-way mirror. Go a little crazy!

Set up FreeBSD on two mirrored drives using UFS

I’ve written about the virtues of Geom Mirror (gmirror) in the past. Geom Mirror was probably the best way of implementing redundant storage from FreeBSD 5.3 (2004) until ZFS was introduced in FreeBSD 7.0 in 2008. Even then, ZFS is heavyweight and the Geom Mirror was tested and more practical for many years afterwards.

The Geom system also has a RAID3 driver. RAID3 is weird. It’s the one using a separate parity drive. It works, but it wasn’t popular. If you had a big FreeBSD system and wanted an array it was probably better to use an LSI host bus adapter and have that manage it with mptutil. But for small servers, especially remotely managed, Geom Mirror was the best. I’m still running it on a few twin-drive servers, and will probably continue for some time to come.

The original Unix File System (UFS2) actually has a couple of advantages over ZFS. Firstly it has much lower resource requirements. Secondly, and this is a big one, it has in-place updates. This is a big deal with random access files, such as databases or VM hard disks, as the Copy-on-Write system ZFS uses fragments the disk like crazy. To maintain performance on a massively fragmented file system, ZFS requires a huge amount of cache RAM.

What you need for random access read/write files are in-place updates. Database engines handle transaction groups themselves to ensure that the data structure’s integrity is maintained. ZFS does this at the file level instead of application level, which isn’t really good enough as the application knows what is and what isn’t required. There’s no harm in ZFS doing it too, but it’s a waste. And the file fragmentation is a high price to pay.

So, for database type applications, UFS2 still rules. There’s nothing wrong with having a hybrid system with both UFS and ZFS, even on the same disk. Just mount the UFS /var onto the ZFS tree.

But back to the twin drive system: The FreeBSD installer doesn’t have this as an option. So here’s a handy dandy script wot I rote to do it for you. Boot off a USB stick or whatever and run it.

Script to install FreeBSD on gmirror

Use as much or as little as you like.

At the beginning of the script I define the two drives I will be using. Obviously change these! If the disks are not blank it might not work. The script tries to destroy the old partition data but you may need to do more if you have it set up with something unusual.

Be careful – it will delete everything on both drives without asking!
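If you want a bit more protection than a deliberately wrong device name, a guard along these lines could go near the top of the script. It’s a sketch of my own, not part of the script below: it relies on FreeBSD disk nodes being character devices, and it only proves the names exist, not that they’re the disks you meant.

```sh
#!/bin/sh
# Sketch: refuse to continue unless both target disks really exist.
# FreeBSD disk nodes are character devices, hence the -c test.

require_disk() {
    if [ ! -c "$1" ]; then
        echo "$1 is not a disk device, giving up" >&2
        return 1
    fi
}

# With the placeholder names used in the script this fails, as intended
D0=/dev/da1-xxxxx
D1=/dev/da2-xxxxx
if require_disk "$D0" && require_disk "$D1"; then
    echo "Both disks found; carrying on"
else
    echo "Edit D0 and D1 first"
fi
```

In the real script you would exit in the else branch rather than just printing a message.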

Read the comments in the script. I have set it up to use an 8g UFS partition, but if you leave out the “-s 8g” the final partition will use all the space, which is probably what you want. For debugging I kept it small.

I have put everything on a single UFS partition. If you want separate / /usr /var then you need to modify it to what you need and create a mirror for each (and run newfs for each). The only thing is that I’ve created a swap partition on each drive that is NOT mirrored and configured it to use both.

I have not set up everything on the new system, but it will boot and you can configure other stuff as you need by hand. I like to connect to the network and have an admin user so I can work on a remote terminal straight away, so I have created an “admin” user with password “password” and enabled the ssh daemon. As you probably know, FreeBSD names its Ethernet adapters by manufacturer and you don’t know what you’ll have so I just have it try DHCP on every possible interface. Edit the rc.conf file how you need it once it’s running.

If base.txz and kernel.txz are in the current directory, fine. The script tries to download them at present.

And finally, I call my mirrors m0, m1, m2 and so on. Some people like to use gm0. It really doesn’t matter what you call them.

#!/bin/sh
# Install FreeBSD on two new disks set up as a gmirror
# FJL 2025
# Edit stuff in here as needed. At present it downloads
# FreeBSD 14.2-RELEASE and uses the two disks
# named in D0 and D1 below

# Fetch the OS files if needed (and as appropriate)
fetch https://download.freebsd.org/ftp/releases/amd64/14.2-RELEASE/kernel.txz
fetch https://download.freebsd.org/ftp/releases/amd64/14.2-RELEASE/base.txz

# Disks to use for a mirror. All will be destroyed! Edit these. The -xxxx
# is there to save you if you don't
D0=/dev/da1-xxxxx
D1=/dev/da2-xxxxx

# User name and password to set up initial user.
ADMIN=admin
ADMINPASS=password

# Make sure the geom mirror module is loaded.
kldload geom_mirror

# Set up the first drive
echo Clearing $D0
gpart destroy -F $D0
dd if=/dev/zero of=$D0 bs=1m count=10

# Then create p1 (boot), p2 (swap) and p3 (ufs)
# Note the size of the UFS partition is set to 8g. If you delete
# the -s 8g it will use the rest of the disk by default. For testing
# it's better to have something small so newfs finishes quick.

echo Creating GPT partition table on $D0
gpart create -s gpt $D0
gpart add -t freebsd-boot -s 512K $D0
gpart add -t freebsd-swap -s 4g $D0
gpart add -t freebsd-ufs -s 8g $D0

echo Installing boot code on $D0
# -b installs protective MBR, -i the Bootloader.
# Assumes partition 1 is freebsd-boot created above.
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $D0

# Set up second drive
echo Clearing $D1
gpart destroy -F $D1
dd if=/dev/zero of=$D1 bs=1m count=10

# Copy partition data to second drive and put on boot code
gpart backup $D0 | gpart restore $D1
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $D1

# Mirror partition 3 on both drives
gmirror label -v m0 ${D0}p3 ${D1}p3

echo Creating file system
newfs -U /dev/mirror/m0
mkdir -p /mnt/freebsdsys
mount  /dev/mirror/m0 /mnt/freebsdsys

echo Decompressing Kernel
tar -x -C /mnt/freebsdsys -f kernel.txz
echo Decompressing Base system
tar -x -C /mnt/freebsdsys -f base.txz

# Tell the loader where to mount the root system from
echo 'geom_mirror_load="YES"' > /mnt/freebsdsys/boot/loader.conf
echo 'vfs.root.mountfrom="ufs:/dev/mirror/m0"' \
>> /mnt/freebsdsys/boot/loader.conf

# Set up fstab so it all mounts.
echo $D0'p2 none swap sw 0 0' > /mnt/freebsdsys/etc/fstab
echo $D1'p2 none swap sw 0 0' >> /mnt/freebsdsys/etc/fstab
echo '/dev/mirror/m0 / ufs rw 1 1' >> /mnt/freebsdsys/etc/fstab

# Enable sshd and make ethernet interfaces DHCP configure
echo 'sshd_enable="YES"' >/mnt/freebsdsys/etc/rc.conf
for int in em0 igb0 re0 bge0 alc0 fxp0 xl0 ue0 xcgbe0 bnxt0 mlx0
do
echo 'ifconfig_'$int'="DHCP"' >>/mnt/freebsdsys/etc/rc.conf
done

# Create initial user suitable for ssh login
pw -R /mnt/freebsdsys useradd $ADMIN -G wheel -m
echo "$ADMINPASS" | pw -R /mnt/freebsdsys usermod -n $ADMIN -h 0
echo "$ADMINPASS" | openssl passwd -6 -stdin | pw -R /mnt/freebsdsys usermod -n $ADMIN -H 0

# Tidy up
umount /mnt/freebsdsys
echo Done. Remove USB stick or whatever and reboot.
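Once the new system has booted it’s worth checking the mirror is healthy. The sketch below picks the state of a named mirror out of gmirror status output. The column layout in the sample is an assumption typed from memory, with invented device names, so compare it with what your system actually prints; the function name is mine.

```sh
#!/bin/sh
# Sketch: pull the state of a named mirror out of "gmirror status" output.
# Reads the status text on stdin; the layout is assumed, so check yours.

mirror_state() {
    awk -v name="mirror/$1" '$1 == name { print $2; exit }'
}

# Illustrative sample only
sample='      Name    Status  Components
mirror/m0  COMPLETE  ada0p3 (ACTIVE)
                     ada1p3 (ACTIVE)'

state=$(printf '%s\n' "$sample" | mirror_state m0)
echo "m0 is $state"
```

On the running system that would be gmirror status | mirror_state m0, and anything other than COMPLETE deserves a closer look.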