Simple guide to the vi editor

vi is the standard text editor that’s been around on Unix and Unix-like systems since 1976. It was written by Bill Joy in one weekend as an enhancement of the original ex editor, which lacked “full screen” mode because, to put it bluntly, full screen terminals hadn’t been invented.

It’s part of the POSIX standard and you’ll never find a Unix that hasn’t got it. Love it or hate it, vi is standard; so you’d better make friends with it because you’re going to need it, even if it’s only briefly while you compile something more modern as a replacement.

There are numerous tutorials and “cheat sheets” for vi, but they’re over-complicated to my mind. Yes, you can do a lot with ex and vi if you remember and type various key sequences correctly, but most people just want to edit a text file, quickly. There are much better editors out there for big jobs, although they weren’t written in a weekend.

So how do you use it, if you don’t want to remember more complex keystrokes than necessary?

The modes

The first thing you need to understand about vi is that it has “modes”. With any normal editor, you move the cursor to wherever you want to type something, and type away. Not so with vi, but it’s not that bad when you understand it. When you first start vi it’s in Command Mode, which would be better described as “move the cursor mode”. It’s waiting for you to move the cursor somewhere. You can enter a command by prefixing it with a colon (“:”), when it will jump the cursor to the bottom of the screen so you can enter the command. Other modes let you enter text or search for things. To return to the Command Mode (i.e. move the cursor mode) you generally press the [Esc] key. Remember that if you’re stuck.

In Command Mode most of the keys do things, usually moving the cursor, often in expected ways. Be careful what you press.

In this text I’m representing the Enter key on the keyboard with [Enter], and Escape with [Esc]. Other characters are shown in double quotes to make them stand out, but are typed without the double quotes. So let’s get started…

Loading a file to edit

If you’re new to vi you might want to make a backup of the file you’re about to mess with. There are mechanisms to have vi do this for you, but for safety just make a copy yourself:

cp filename filename.safe

Then on to edit the file. Type “vi filename”

This will load the file named filename into the editor. To make the editor a little more friendly you can type:

:set verbose showmode

Followed by [Enter].

Quitting or saving a file

This uses a colon command (see above if you skipped the introduction). To quit type :q followed by the Enter key. If you’ve made changes to the file it will say “File modified since last complete write” or similar, and won’t let you quit. You can override this by adding a ! to the command to show you really mean it:

:q! [Enter]

To write changes to disk use :w [Enter]. This is the equivalent of “Save” on a modern editor. To write the current file to another file (i.e. Save As…) use :w newfilename [Enter]

A quick shortcut if you want to save and exit the editor in one go is :x [Enter]

Moving around the file

When the file is loaded you’re in Command Mode, which might more usefully be thought of as Movement Mode. You can’t type anything but you can move the cursor around. On anything reasonably modern you can do this using the arrow keys on the keyboard and [PgUp] and [PgDn]. [Home] and [End] probably work too. There are other ways of doing this using original keyboards that lacked cursor keys, which I’ll cover later if needed.

Changing stuff

Let’s assume you’ve got the cursor to the place in the file where you want to make the change. Most of the time you’ll want to either delete stuff or insert stuff. To delete what’s under the cursor type the “x” key. If you want to delete a lot, “D” deletes everything to the end of the line and “dd” deletes the whole line, but if you can’t remember just stick to “x”. Note that it’s case sensitive – “D” and “d” are not the same.

To insert something, type the “i” key to get into Insert Mode. Other modes are available, but Insert Mode is what most people are used to.

When you’re in Insert Mode, everything you type will be inserted, including new lines if you press the return key. To get out of insert mode press the [Esc] key. And this is a general rule, if you’re in a mess with vi keep hitting the [Esc] key until you get back into Command Mode.

You might, of course, have made a mess of the edit. A single “u” will undo the last change you made in Insert Mode. If you press “u” a second time it will toggle the undone changes back.

If you make a real mess of it just quit out using “:q!” and start again. It sometimes pays to do a “:w” while editing to save good changes.

Cut+paste with a mouse

vi has all sorts of ways of moving text around, but to keep it simple I’m going to assume you’re using a virtual terminal (something like PuTTY) and have a mouse. Just select the text you want to copy, move to where you want it to go and put vi in insert mode with “i”. If you can’t remember which mode you’re in type [Esc] “i” to be sure. Then right-click the mouse and it will paste.

Search and replace

To search for something in the file hit [Esc] to make sure you’re in Command Mode and type “/” followed by whatever you’re looking for, followed by Enter. If you want to search backwards use “?” instead of “/”. The cursor will jump to the first occurrence it finds.

If you want to search for the next occurrence use the “n” key, if you want to go backwards use the capital “N”.

To do a Search and Replace you’ll just have to go with me on this. Again, make sure you’re in Command Mode and hit “:” for a colon command. To replace “old” with “new” in the entire file the colon command looks like this:

:%s/old/new/g

Basically the % means every line, the s means substitute, and the g means change every occurrence in a line rather than just the first. The “/” marks the old and new fields. If you want to replace something with a “/” character in it use a different separator character like “.” or “|” – it’s just looking for a punctuation mark and it will carry on using whatever punctuation mark it finds first.
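The sed utility uses the same substitute syntax, so you can try out a pattern on the command line before using it inside vi. This is only a minimal sketch; the function name and the paths are made up for illustration:

```shell
#!/bin/sh
# Replace a string containing "/" by using "|" as the separator,
# the same trick as :%s|old|new|g inside vi.
# swap_prefix and the sample paths are made up for illustration.
swap_prefix() {
        sed 's|/usr/local|/opt|g'
}

printf '%s\n' 'PATH=/usr/local/bin:/usr/local/sbin' | swap_prefix
```

Both occurrences on the line are changed because of the trailing g.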

That’s it?

You want it to be more complicated? I’ve seen many tutorials and cheat sheets explain a lot of stuff you don’t need for simple editing. Commands to move the cursor quickly, repeated commands and so on. Yes, you can scroll ten lines down using 10j but who cares? Just hit the down arrow ten times – assuming you even know that your destination is that far below. I might add a batch of secondary very useful commands later but they’re not essential. However, read on if you don’t have cursor keys or can’t use a mouse for cut and paste, or care about line numbers.

Line numbers

If you care which line you’re on, possibly because you’re getting a message like “error in line 123” of your config file and need to fix it, there are a few extras that might help. If you want to jump to line 123 use “:123”. As you can imagine, to get to the top of the file quickly you can also use “:1”, and use “:$” to go to the bottom.

You can turn on a display of the current line number and cursor column with “:set ruler” and display line numbers with “:set number”. To turn these things off prefix them with “no” – e.g. “:set nonumber”.

Cut+paste with keyboard

If you’re using a real hardware terminal instead of a virtual software one (with a mouse) you can still cut and paste, but I’ll need to explain something about buffers and deleting first. When you delete anything in vi it goes into a buffer. The basic delete command is “d” followed by any movement command. Anything between the old and new cursor positions gets deleted and put in the buffer. There are a lot of movement commands, but the one we’re interested in is “jump to marker”.

To place a marker in the file at the cursor position you must be in Command Mode (hit [Esc] if you’re not sure) and then type “ma”. This sets marker “a”. You can also set marker b, c, d and so on, which is useful if you want to bookmark different places in a file.

To jump to a marker use a “ ` ” (single back quote, normally to the left of the “1” on the keyboard) followed by the marker letter – i.e. `a

So to cut some text, set the marker at the start, move to the end and type d`a and the text will disappear into the paste buffer. If you wanted to Copy instead of Cut use y`a instead – y is Copy.

In order to paste the contents of the buffer, go to where you need it and type capital “P”. This will insert at the current cursor position, which is what most people want.

A variation on this is to use a ‘ instead of a ` (a single quote instead of a backquote), which will cut/paste whole lines between the cursor and the marker. Beware, a lower case “p” instead of a capital for paste puts the buffer after the current cursor position by either one character or one line depending on whether you used a ` or a ‘ originally. Stick with P to avoid confusion.

No cursor keys?

If your keyboard doesn’t have cursor keys, or they’re not working, you can use the following to move around:

h=left, l=right, k=up, j=down. Ctrl-B and Ctrl-F give you Page Up and Page Down respectively.

A note on VIM

vim is a “Vi Improved” editor found in GNU/Linux. It claims to be mostly compatible with VI and most of the above should work. One of its improvements is a multi-buffer undo facility, presumably because Linux users make more mistakes.

It’s curious that anyone would want to improve on vi, and I say this having used it for over 45 years now. It’s not an editor that I’d choose as a starting point to write a better one. Improved editors are out there, including the straightforward nano editor that is bundled with most GNU/Linux distros; you may find that more friendly if it’s available.

After all this time using vi I certainly know a few tricks and can use it very quickly and efficiently, but the main reason for learning it now is that it will be there when you need to edit something, on every Unix-like system.

FreeBSD error: leapsecond file expired

You might see something odd in the console log about a leapsecond file:

ntpd[905]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): expired 841 days ago

It just means that the leap second file is out of date, and couldn’t be updated automatically. It’s not a big deal as this data rarely changes. In fact the last leap second was added at the end of 2016. You can force an update with the following line:

service ntpd fetch

However, this might well give you another error like “fetch: https://www.ietf.org/timezones/data/leap-seconds.list: Not Found”. The reason is that the IETF is no longer hosting the leap-seconds file although it’s baked into the FreeBSD (and likely other) configs.

On FreeBSD you can easily point it to another host at IANA by adding the following to /etc/rc.conf:

ntp_leapfile_sources="https://data.iana.org/time-zones/data/leap-seconds.list"

After you’ve done this you can trigger it manually using the fetch line above. The old source is actually configured originally in defaults/rc.conf, but it’s best to override it in the local rc.conf – and newer releases of FreeBSD have updated it anyway.
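If you want to check the result, the file itself says when it expires. The following is a rough sketch, assuming the usual leap-seconds.list layout in which the line starting “#@” holds the expiry as an NTP timestamp (seconds since 1900). The function name leap_expiry_unix is my own invention:

```shell
#!/bin/sh
# Print the expiry time of a leap-seconds.list file as a Unix timestamp.
# Assumes the line starting "#@" holds the expiry as an NTP timestamp
# (seconds since 1900-01-01). leap_expiry_unix is a made-up helper name.
leap_expiry_unix() {
        ntp=$(sed -n 's/^#@[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
        [ -n "$ntp" ] || return 1
        # 2208988800 is the number of seconds between 1900 and 1970
        echo $((ntp - 2208988800))
}

# Example: days left before the file ntpd is using expires
leapfile=/var/db/ntpd.leap-seconds.list
if [ -r "$leapfile" ] && expiry=$(leap_expiry_unix "$leapfile")
then
        echo "Days until expiry: $(( (expiry - $(date +%s)) / 86400 ))"
fi
```

A negative number means the file has already expired and the fetch is worth running.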

Process states in “top”

This applies to FreeBSD, but is similar on Linux.

Both the top and ps utilities will tell you what a given process is doing, which is generally running on a CPU or waiting for something. However, the documentation doesn’t really tell you what these states mean. The man page for the ps utility suggests reading the system source code (sys/proc.h).

In this post I’ll deal with the common process states in top, the STATE column in the screenshot below.

Other columns are:

  • PID is the process-ID
  • USERNAME the user that the process is running under.
  • THR isn’t documented but I’m very sure it’s the thread count – i.e. the number of threads used by a multi-threaded process.
  • PRI is the current process priority, and NICE is the nice value – an often misunderstood weighting used by the scheduler when determining the current priority. It’s outside the scope of this post.
  • SIZE and RES are the total size of the process and the amount of real RAM currently being used, given it may have allocated memory that hasn’t been used yet or may be paged out.
  • C is the CPU number to which the process is currently assigned
  • TIME is the amount of CPU time (in seconds) the process has used since it was started.
  • WCPU is the percentage CPU time currently being used by the process. Note that if you have four CPUs you can have 400% utilisation, as this applies to a single CPU.

And then, of course, there’s STATE.

Officially, the state is one of “START”, “RUN”, “SLEEP”, “STOP”, “ZOMB”, “WAIT”, “LOCK” or the event being waited for. RUN means it’s the currently running process, but on SMP systems RUN will be replaced by CPUn, where n is the CPU number doing the running. You’re unlikely to actually see the others as if a process isn’t running it’s going to be waiting for an event. But this is what they mean:

  • START. A very short-lived state when the process is in the process of being created.
  • SLEEP. The process can’t run as it’s waiting for an event (a character to be typed, a disk operation to complete and suchlike). In top you normally see the event being waited for, and these will be listed later.
  • WAIT. A parent process is waiting for a child process to finish, or more accurately, change state. This means the parent process has called wait(), waitpid(), wait4() or similar (see man 2 wait for a full list).
  • LOCK. The process is waiting until the kernel grants it a lock of some kind. You normally see the lock it’s waiting for prefixed with a ‘*’ rather than just plain “LOCK”.
  • CPUn. The process is currently running on CPU n on an SMP system.
  • RUN. The process is currently running on the single CPU.
  • STOP. The process has been stopped (suspended) by sending it a SIGSTOP, or a SIGTSTP (e.g. by typing Ctrl-Z). It may be restarted using SIGCONT (or running fg/bg).
  • ZOMB. A process has exited but remains in the process table as the parent hasn’t collected its exit status yet. This state doesn’t normally last long unless something’s wrong with the parent. You can’t kill a zombie process (the clue is in the name); if you have one hanging around, killing its parent will normally clear it, because init inherits the zombie and collects its exit status. Don’t worry too much either way, as it won’t be using much memory or other resources.
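The STOP state is the easiest one to produce on purpose. Here’s a small sketch that suspends a background sleep, asks ps for its state and then lets it go again. It assumes a ps that accepts -o stat= and -p, and the function name is just for illustration:

```shell
#!/bin/sh
# Suspend a background process with SIGSTOP, look at its state, then
# resume it with SIGCONT. ps reports a stopped process with a state
# beginning with "T", which is what top shows as STOP.
# stopped_state is a made-up name for this demonstration.
stopped_state() {
        sleep 30 >/dev/null &
        pid=$!
        kill -STOP "$pid"
        sleep 1                         # give the signal time to take effect
        ps -o stat= -p "$pid" | tr -d ' '
        kill -CONT "$pid"               # let it run again...
        kill -TERM "$pid"               # ...and then tidy it away
        wait "$pid" 2>/dev/null || true
}

stopped_state
```

If you run top in another window while the process is suspended you should see it listed as STOP.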

As I’ve said, you probably won’t see many of these as a process spends most of its time waiting for an event to happen, and in such cases, it shows the event in question. Common events are:

STATE | Meaning | Reason or system call(s) involved
kqread | Waiting for an event to be posted to a kqueue descriptor | kevent(); extremely common in modern servers (e.g., nginx, OpenZFS-related daemons, libevent-based apps)
sigwai | Waiting for a signal | sigwait(), sigwaitinfo(), sigtimedwait(); used by POSIX signal-handling threads
select | Waiting to read/write file. | Legacy select() or pselect() calls, still common but being replaced with kqueue/poll.
nanslp | Sleeping with nanosecond precision | nanosleep() or clock_nanosleep(); used for timers, short sleeps, rate limiting.
lockf | Blocked waiting on an advisory file record lock (byte-range lock) | Database or similar waiting to lock part of a shared file. fcntl(…, F_SETLKW, …)
accept | Waiting for incoming TCP connection | Classic blocking accept loop; seen in prefork servers, simple daemons calling accept()
pause | Suspended waiting for any signal | Used by older software (including the shell!) calling pause()
wait | Waiting for a child process to change state or end. | wait(), waitpid() etc. Very common for parent processes (shells, init-like processes, daemons that fork children)
CPUn | Actively running on CPU number ‘n’ | It may mean that the process is in a state that it can be given to a CPU, or it may actually be running.
sbwait | Waiting for socket buffer space (send) or data arrival (receive) | Socket I/O wait (e.g., TCP send buffer full or recv waiting)
biord, biow | Blocked on block I/O read / write (disk/network filesystem operations) | Waiting for disk I/O completion
piperd, pipewr | Blocked reading or writing to a pipe | Pipe I/O wait. Given pipes are now sockets you don’t see this on BSD any more (or at least, I don’t).
uwait | Userland wait | Often related to threading / synchronization primitives like pthread_cond_wait(), sem_wait()

FreeBSD/Linux as Fibre Broadband router Part 3

In parts one and two I covered making the PPP connection, firewall and the DHCP server. This just leaves DNS.

Unbound

FreeBSD has stopped providing a proper DNS server (BIND – the Berkeley Internet Name Domain) in the base system, replacing it with “unbound”. This might be all you need if you just want to pass DNS queries through to elsewhere and have them cached. It will even allow you to configure your local name server for hosts on the LAN.

To kick off unbound once, run “service local_unbound onestart”. This will clobber your /etc/resolv.conf file but it keeps a backup – note well where it’s put it! Probably something like /var/backups/resolv.conf.20260103.113619 (where the suffix is the date and time).

For some strange reason (possibly Linux related) the configuration files for unbound are stored in /var/unbound – notably unbound.conf. By default it will only resolve addresses for localhost, so you’ll need to do a bit of tweaking. Assume your LAN is 192.168.1.0/24 and this host (the gateway/router) is on 192.168.1.2 as per the earlier articles. Add the lines to the server section so it becomes:

server:
        username: unbound
        directory: /var/unbound
        chroot: /var/unbound
        pidfile: /var/run/local_unbound.pid
        auto-trust-anchor-file: /var/unbound/root.key

        interface: 192.168.1.2
        interface: 127.0.0.1
        access-control: 127.0.0.0/8 allow
        access-control: 192.168.1.0/24 allow

        # Paranoid blocking of queries from elsewhere
        access-control: 0.0.0.0/0 refuse
        access-control: ::0/0 refuse

There is a warning at the top of the file that it was auto-generated but it’s safe to edit manually in this case. The interface lines are, as you might expect, the explicit interfaces to listen on. The access-control lines are vital, as listening on an interface doesn’t mean it will respond to queries on that subnet. The paranoid blocking access-control lines are probably redundant unless you make a slip-up in configuring something somewhere else and a query slips in through the back door.

Once configured you can use 192.168.1.2 as your LAN’s DNS resolver by setting isc-dhcpd to issue it. Add local_unbound_enable="YES" to your /etc/rc.conf file to have it load on boot.

BIND

Unbound is a lightweight local DNS resolver, but you might want full DNS. I know I do. Therefore you’ll need to install BIND (aka named).

We’re actually looking for BIND9, so search packages for the version you want. This will currently be bind918, bind920 or bind9-devel. Personally I’ll leave someone else to play with the latest version and go for the middle (BIND 9.20).

pkg install bind920

You’ll then need to generate a key to control it using the rndc utility (more on that later)

rndc-confgen -a

Next we’ll need to edit some configuration files:

cd /usr/local/etc/namedb

Here you should find named.conf, which is identical to named.conf.sample in case it’s missing or you break it. The changes are minor.

Around line 20 there’s the listen-on option. Set this to:

listen-on { 127.0.0.1; 192.168.1.2;};

Again, this assumes that 192.168.1.2 is this machine. That’s all you need to do if you want it to provide services to the LAN. While we’re in the options section change the zone file format from modern binary to text. Binary is quicker for massive multi-zone DNS servers, but text is traditional and more convenient otherwise.

masterfile-format text;

If you’re going to do DNS properly you need to configure the local domain. At the end of the file add the following as appropriate. In this series we’re assuming your domain is example.com and this particular local site is called mysite – i.e. mysite.example.com. All hosts on this site will therefore be named as jim.mysite.example.com, printer.mysite.example.com and so on.

zone "mysite.example.com"
{
        type primary;
        file "/usr/local/etc/namedb/primary/mysite.example.com";
};

zone "1.168.192.in-addr.arpa"
{
        type primary;
        file "/usr/local/etc/namedb/primary/1.168.192.in-addr.arpa";
};

The first file is the zone file, mapping hostnames on to IP addresses. The second is the reverse lookup file. They will look something like this:

; mysite.example.com
;
$TTL 86400      ; 1 day
mysite.example.com.       IN SOA  ns0.mysite.example.com. hostmaster.example.com. (
                                2006011238 ; serial
                                18000      ; refresh (5 hours)
                                900        ; retry (15 minutes)
                                604800     ; expire (1 week)
                                36000      ; minimum (10 hours)
                                )
@                       NS      ns0.mysite.example.com.

adderview1              A       192.168.1.204
c5750                   A       192.168.1.201
canoninkjet             A       192.168.1.202
dlinkswitch             A       192.168.1.5
gateway                 A       192.168.1.2
eap245                  A       192.168.1.6
eap265                  A       192.168.1.8
fred-pc                 A       192.168.1.101
ns0                     CNAME       gateway

This is the zone file. I’m not going to explain everything about it here, just that this is a working example and the main points about it.

The first lines, starting with a ‘;’ are comments.

Next comes $TTL, which sets the default time-to-live for everything that doesn’t specify differently, and is basically the number of seconds that systems are supposed to cache the result of a lookup. You might want to reduce this to something like 30 seconds if you’re experimenting. You must specify the default TTL first thing in the file.

Then comes the SOA (Start of Authority) for the domain. It’s specifying the main name server (ns0.mysite.example.com) and the email address of the DNS administrator. However, as ‘@’ has a special meaning in zone files it’s replaced by a dot – so it really reads hostmaster@example.com. If the part of the email address before the @ contains a dot, it has to be escaped with a backslash.

The other values are commented – just use the defaults I’ve given or look them up and tweak them. The only important one is the first number – the serial. This is used to identify which is the newest version of the zone file when it comes to replication, and the important rule is that when you update the master zone file you need to increment it. There’s a convention that you number them YYYYMMDDxx where xx allows for 100 revisions within the day. But it’s only a convention. If you only have one name server, as here, then it’s not important as it’s not replicating.
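The date-based convention is easy to script if you’d rather not work the number out by hand. This is only a sketch of the idea; next_serial is a hypothetical helper of mine and not something that comes with BIND:

```shell
#!/bin/sh
# Work out the next serial number using the YYYYMMDDxx convention.
# next_serial is a hypothetical helper, not part of BIND.
#   $1 = current serial, $2 = today's date as YYYYMMDD (defaults to today)
next_serial() {
        current=$1
        today=${2:-$(date +%Y%m%d)}
        if [ "${current%??}" = "$today" ]
        then
                echo $((current + 1))   # same day: bump the revision
        else
                echo "${today}00"       # new day: start again at 00
        fi
}

next_serial 2006011238 20060112         # prints 2006011239
next_serial 2006011238 20260103         # prints 2026010300
```

It makes no attempt to cope with more than 100 changes in a day, or with an existing serial that is dated in the future.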

Next we define the name servers for the domain with NS records. We’ve only got one, so we only have one NS record. The @ is a macro for the current “origin” – i.e. mysite.example.com.

Note well the . on the end of names. This means start at the root – it’s important. Some web browsers allow you to omit it in URLs, and guess you always mean to start at the root – but DNS doesn’t!

Then come the A or Address records. They’re pretty self-explanatory. Because the “origin” is set as mysite.example.com the first line effectively reads:

adderview1.mysite.example.com. A 192.168.1.204

This means that if someone looks up adderview1.mysite.example.com they get the IP address 192.168.1.204. Simple! You can have an AAAA record that gives the IPv6 address, but I won’t cover that here.

The last line is like an A record but is a CNAME, which is defining an alias. ns0 is aliased to gateway, which ultimately ends up as being 192.168.1.2 – i.e. the address of our router/DNS server. There is nothing stopping you from having multiple A records pointing to the same IP address – and in some ways it’s better to use an absolute address. In particular, the target of an NS record isn’t supposed to be an alias (RFC 2181), and BIND’s checking tools are likely to complain about it, so outside of an experiment give ns0 an A record of its own. It comes down to how you want to manage things, and my desire to get a CNAME example in here somewhere.

The corresponding reverse lookup file goes like this:

; 1.168.192.in-addr.arpa
;
$TTL 86400      ; 1 day
@        IN SOA  ns0.mysite.example.com. hostmaster.example.com. (
                                2006011231 ; serial
                                18000      ; refresh (5 hours)
                                900        ; retry (15 minutes)
                                604800     ; expire (1 week)
                                36000      ; minimum (10 hours)
                                )
@       IN NS      ns0.mysite.example.com.

2       PTR gateway.mysite.example.com.
6       PTR eap245.mysite.example.com.
8       PTR eap265.mysite.example.com.
101     PTR fred-pc.mysite.example.com.
201     PTR c5750.mysite.example.com.
202     PTR canoninkjet.mysite.example.com.
204     PTR adderview1.mysite.example.com.

As you can see, it’s pretty much the same until you get to the PTR records. These are like A records but go in reverse. In case you’re wondering about the name, it’s important. Note it’s the first three bytes of the subnet but backwards. The last byte is the first part of the PTR line, and the last part is the FQDN to be returned if you do a reverse lookup on the IPv4 address.

Therefore, if you reverse lookup 192.168.1.101 it will look in 1.168.192.in-addr.arpa for a PTR record with 101 as the key and return fred-pc.mysite.example.com.

This all goes back to the history of the Internet, or more precisely, its precursor called ARPAnet. The .arpa TLD was supposed to be temporary during the transition, but it stuck around. Just do it the way I’ve said or fall flat on your face.
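If the backwards naming makes your head hurt, it’s simple enough to have the shell work it out. A minimal sketch follows; reverse_name is just a name I’ve picked for the example:

```shell
#!/bin/sh
# Turn an IPv4 address into the name used for its reverse (PTR) lookup
# by reversing the four bytes and appending in-addr.arpa.
# reverse_name is a made-up helper for illustration.
reverse_name() {
        old_ifs=$IFS
        IFS=.
        set -- $1                       # split the address on the dots
        IFS=$old_ifs
        echo "$4.$3.$2.$1.in-addr.arpa"
}

reverse_name 192.168.1.101      # prints 101.1.168.192.in-addr.arpa
```

The first label (101) is the key of the PTR record, and the rest is the name of the reverse zone it lives in.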

You can have a reverse lookup for IPv6 using an ip6.arpa file, but I’m not going to cover that this time.

Once you’ve made all these changes and set up your zone file, just kick it off with “service named start” (or onestart). To make it start on boot add named_enable="YES" to /etc/rc.conf

Debugging

You can test it’s working with “host gateway.mysite.example.com 127.0.0.1” and “host gateway.mysite.example.com 192.168.1.2” – both should return 192.168.1.2.

Error messages can be found in /var/log/messages – however they’re not always that revealing! Fortunately BIND comes with some useful checking tools, such as named-checkzone.

named-checkzone mysite.example.com /usr/local/etc/namedb/primary/mysite.example.com

This sanity checks that the zone file (second argument) is a proper zone file for the domain name specified in the first argument. We’ve named the file after the domain, which can be confusing but has many advantages in other situations.

You can also check the reverse lookup file in the same way:

named-checkzone 1.168.192.in-addr.arpa /usr/local/etc/namedb/primary/1.168.192.in-addr.arpa

It’ll either come up with warnings or errors, or say it would have been loaded with an OK message.

Next Stage

In Part 2 I explained how to set up the OpenBSD DHCP daemon and here I’ve explained unbound as well as BIND. But for redundancy, the full ISC DHCP Daemon and BIND are necessary as they are able to replicate so one server can carry on if the other fails. That’s the next installment.

Blocking script kiddies with PF

OpenBSD’s PF firewall is brilliant. Not only is it easy to configure with a straightforward syntax, but it’s easy to control on-the-fly.

Supposing we had a script that scanned through log files and picked up the IP address of someone trying random passwords to log in. It’s easy enough to write one. Or we noticed someone trying it while logged in. How can we block them quickly and easily without changing /etc/pf.conf? The answer is a pf table.

You will need to edit pf.conf to declare the table, thus:

# Table to hold abusive IPs
table <abuse> persist

“abuse” is the name of the table, and the <> are important! persist tells pf you want to keep the table even if it’s empty. It DOES NOT persist the table through reboots, or even restarts of the pf service. You can dump and reload the table if you want to, but you probably don’t in this use case.

Next you need to add a line to pf.conf to blacklist anything in this table:

# Block traffic from any IP in the abuse table
block in quick from <abuse> to any

Make sure you add this in the appropriate place in the file (near or at the end).

And that’s it.

To add an IP address (example 1.2.3.4) to the abuse table you need the following:

pfctl -t abuse -T add 1.2.3.4

To list the table use:

pfctl -t abuse -T show

To delete entries or the whole table use one of the following (flush deletes all):

pfctl -t abuse -T delete 1.2.3.4
pfctl -t abuse -T flush

Now I prefer to use a clean interface, and on all systems I implement a “blackhole” command, that takes any number of miscreant IP addresses and blocks them using whatever firewall is available. It’s designed to be used by other scripts as well as on the command line, and allows for a whitelist so you don’t accidentally block yourself! It also logs additions.

#!/bin/sh

/sbin/pfctl -sTables | /usr/bin/grep '^abuse$' >/dev/null || { echo "pf.conf must define an abuse table" >&2 ; exit 1 ; }

whitelistip="44.0 88.12 66.6" # Class B networks that shouldn't be blacklisted

for nasty in "$@"
do
        echo "$nasty" | /usr/bin/grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' >/dev/null || { echo "$nasty is not a valid IPv4 address" >&2 ; continue ; }

        classb=$(echo "$nasty" | cut -d . -f 1-2)

        case " $whitelistip " in
                *" $classb "*)
                echo "Whitelisted Class B $nasty"
                continue
                ;;
        esac

        if /sbin/pfctl -t abuse -T add "$nasty"
        then
                echo "Added new entry $nasty"
                echo "$(date "+%b %e %H:%M:%S") Added $nasty" >>/var/log/blackhole
        fi
done

That’s all there is to it. Obviously my made-up whitelist should be set to something relevant to you.
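One thing the grep in the script doesn’t catch is a number that’s too big, such as 999.1.1.1. If that bothers you, a stricter check might look something like this. It’s a sketch only; valid_ipv4 is a hypothetical helper and not part of the script above:

```shell
#!/bin/sh
# Stricter IPv4 check: four dot-separated numbers, each 0-255.
# valid_ipv4 is a hypothetical helper that could replace the grep test.
valid_ipv4() {
        echo "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 1
        old_ifs=$IFS
        IFS=.
        set -- $1                       # split into the four numbers
        IFS=$old_ifs
        for octet in "$@"
        do
                [ "$octet" -le 255 ] || return 1
        done
        return 0
}

for ip in 1.2.3.4 999.1.1.1 1.2.3
do
        if valid_ipv4 "$ip"; then echo "$ip ok"; else echo "$ip rejected"; fi
done
```

Only the first of the three test addresses should be reported as ok.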

So how do you feed this blackhole script automatically? It’s up to you, but here are a few examples:

/usr/bin/grep "checkpass failed" /var/log/maillog | /usr/bin/cut -d '[' -f 3 | /usr/bin/cut -d ']' -f 1 | /usr/bin/sort -u

This goes through the mail log and produces a list of IP addresses where people have used the wrong password to sendmail.

/usr/bin/grep "auth failed" /var/log/maillog | /usr/bin/cut -d , -f 4 | /usr/bin/cut -c 6- | /usr/bin/sort -u

The above does the same for dovecot. Beware, these are brutal! In reality I have an additional grep in the chain that detects invalid usernames, as most of the script kiddies are guessing at these and are sure to hit on an invalid one quickly.

Both of these examples produce a list of IP addresses, one per line. You can pipe this output using xargs like this.

findbadlogins | xargs -r blackhole

The -r simply deals with the case where there’s no output, and will therefore not run blackhole – a slight efficiency saving.
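You can see the effect of -r for yourself without blocking anyone. In this sketch echo stands in for the blackhole script, and the addresses are made up. The -r option comes from GNU xargs, and FreeBSD’s xargs accepts it too:

```shell
#!/bin/sh
# Show what -r does: with no input, the command is not run at all.
# echo stands in for the blackhole script here.
printf '1.2.3.4\n5.6.7.8\n' | xargs -r echo blackhole   # one run, both addresses
printf '' | xargs -r echo blackhole                     # nothing is run or printed
```

The first line prints “blackhole 1.2.3.4 5.6.7.8” and the second prints nothing at all.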

If you don’t have pf, the following also works (replace the /sbin/pfctl in the script with it):

/sbin/route -q add $nasty 127.0.0.1 -blackhole 2>/dev/null

This adds the nasty IP address to the routing table as a blackhole route, so anything this host tries to send back to it goes somewhere the sun don’t shine and the connection never completes. pf is probably more efficient than the routing table, but only if you’re using it. This is a quick and dirty way of blocking a single address out-of-the-box.

ZFS or UFS?

I started writing the last post as a discussion of ZFS and UFS and it ended up an explainer about how UFS was viable with gmirror. You need to read it to understand the issues if you want redundant storage. But in simple terms, as to which is better, ZFS is. Except when UFS has the advantage.

UFS had a big problem. If the music stopped (the kernel crashed or the power was cut) the file system was in a huge mess as the data on disk wasn’t updated in the right order as it went along. This file system was also known as FS or FFS (Fast File System) but they were more or less the same thing, and it is now history. UFS2 came along (and JFS2 on AIX), which had journaling so that if there was an abrupt stop it could probably catch up with itself when the power came back. As with databases, a journal keeps an ordered record of updates so you can apply them to a potentially messed up system later in case they were missed. Now we’re really talking about UFS2 here, which is pretty solid.

Then along comes ZFS, which combines a next generation volume manager and next generation file system in one. In terms of features and redundancy it’s way ahead. Some key advantages are built-in and very powerful RAID, Copy-on-Write for referential integrity following a problem, snapshots, compression, scalability – the list is long. If you want any of these good features you probably want ZFS. But there are two instances where you might want to stick with UFS2.

Cost

The first problem with ZFS is that all this good stuff comes at a cost. It’s not a huge cost by modern standards – I’ve always reckoned an extra 2GB of RAM for the cache and suchlike covers the resource and performance issues. But on a very small system, 2GB of RAM is significant.

The second problem is more nuanced. Copy-on-Write. Basically, in order to get the referential integrity and snapshots, when you change the contents of a block within a file ZFS doesn’t overwrite the block with new data. It writes a new block in free space and links to that instead. If the old block isn’t needed as part of a snapshot it will be marked as free space afterwards. This means that if there’s a failure while the block is half written, no problem – the old block is there and the write never happened. Reboot and you’re at the last consistent state, no more than five seconds before some idiot dug up the power cable.

Holy CoW!

So Copy-on-Write makes sense in many ways, but as you can imagine, if you’re changing small bits of a large random access file, that file is going to end up seriously fragmented. And there’s no way to defragment it. This is exactly what a database engine does to its files. Database engines enforce their own referential integrity using synchronous writes, so they’re going to be consistent anyway – but if you’re insisting all transactions in a group are written in order, synchronously, and the underlying file system is spattering blocks all over the disk before returning, you’ve got a double whammy – fragmentation and slow write performance. You can put a lot of cache in to try and hide the problem, but you can’t cache a write if the database insists it won’t proceed until it’s actually stored on disk.

In this one use case, UFS2 is a clear winner. It also doesn’t degrade so badly as the disk becomes full. (The ZFS answer is that if the zpool is approaching 80% capacity, add more disks).

Best of Both

There is absolutely nothing stopping you having ZFS and UFS2 on the same system – on the same drives even. Just create a partition for your database, format it using newfs and mount it on the otherwise ZFS tree wherever it’s needed. You probably want it mirrored for redundancy, so use gmirror. You won’t be able to snapshot it, or otherwise back it up while it’s running, but you can dump a database to a ZFS dataset and have that replicated along with all the others.

You can also boot off UFS2 and create a zpool on additional drives or partitions if you prefer, mounting them on the UFS tree. Before FreeBSD 10 had full support for booting directly off ZFS this was the normal way of using it. The advantages of having the OS on ZFS (easy backup, snapshot and restore) mean it’s probably preferable to use it for the root now, and mount any UFS2 file systems in directories off it.

UFS, gmirror and GPT drives

Spot the deliberate mistake

Over eight years ago now I wrote a post, “ZFS is not always the answer. Bring back gmirror!”, suggesting that writing off UFS in favour of ZFS wasn’t a clear cut decision and reminding people how gmirror could be used to mirror drives if you needed redundancy. It’s still true, but it probably needs an update as things are done a little differently now.

MBR vs GPT

There have been various disk partition formats over the years. The original PDP-11 Unix contained only a boot block (512 bytes) to kick start the OS, but BSD implemented its own partitioning scheme from 386BSD onwards – 8K long consisting of a tiny boot1 section that was just enough to find boot2 in the same slice, which was then able to read UFS and therefore the kernel. This first appeared in 4.2BSD on the VAX.

Then from the early 1990s the “standard” hard disk partition scheme from the MS-DOS Master Boot Record (MBR) seemed like a great idea. Slices got replaced by partitions and you could co-exist with other systems on the same drive; and x86 systems were now really common, especially compared to VAXes.

The so-called MBR scheme had its problems (and workarounds) as Microsoft wasn’t exactly thinking ahead, but these have been fixed thanks to the wonderful GPT scheme, which was actually designed. However, GEOM Mirror and UFS predate GPT adoption and you have to be aware of a few things if you’re going to use them together. And you should be using GPT.

Why should you use GPT just because it’s “new”? Not so new, in fact. It was actually dreamt up more than 25 years ago by Intel (on the IA-64 I believe). GPT has a backup header so if you lose the first blocks on your drive you’re not dead in the water – a favourite trick with DOS/Windows losing the entire drive for the sake of one sector. GPT allows drives to be more than 2TB because it has 64-bit logical block addresses. If that’s not enough, it identifies partitions with a UUID so you can move them around physically and the OS can still find them rather than always having to hang them off the same controller port. And if you’re mixing operating systems on the same disk the others are likely to be using GPT too, so they’ll play nice. As long as you have UEFI compatible firmware, you’re good to go. If all your drives are <2TB and you have old firmware, and only want to run FreeBSD, stick to MBR – and keep a backup of the boot block on a floppy just in case.
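
The 2TB figure falls straight out of the arithmetic. As a rough sketch, assuming the traditional 512-byte sector and the 32-bit block addresses that MBR uses (the function name is mine):

```sh
#!/bin/sh
# Largest size addressable with a given number of LBA bits,
# assuming 512-byte sectors. Prints the answer in bytes.
max_bytes() {
    bits=$1
    echo $(( (1 << bits) * 512 ))
}

echo "32-bit LBA (MBR): $(max_bytes 32) bytes"   # 2199023255552, i.e. 2 TiB
```

Don’t try it with 64: the answer is far bigger than shell arithmetic can hold, which rather makes the point.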

Gmirror and GPT

As I mentioned, GPT keeps a second copy of the partition information on the disk. In fact it stores a copy at the end of the drive, and if the table at the front is corrupt or unreadable it’ll use that instead. Specifically GPT stores a header in LBA 1 and the partition table in LBA 2-33 (an insanely large partition table but Intel didn’t want to be accused of making the same limiting mistakes as Microsoft).

The backup GPT header is on the last block of the drive, with the backup partition table going backwards from that (for 33 LBAs).

GMirror, meanwhile, stores its metadata on the last 512-byte sector of the drive. CRUNCH.
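
To make the clash concrete, here is a small sketch of the arithmetic. The sector count is just an example figure for a nominal 3TB drive; substitute your own from diskinfo or gpart show. LBAs count from zero, so the last one is the total minus one.

```sh
#!/bin/sh
# Example only: total number of 512-byte sectors on the whole drive.
SECTORS=5860533168

last_lba=$(( SECTORS - 1 ))

# GPT: backup header in the last LBA, backup table in the 32 LBAs before it.
gpt_backup_header=$last_lba
gpt_backup_table_start=$(( last_lba - 32 ))

# gmirror: metadata in the last sector of whatever provider it is given.
# Given the whole disk, that is the same last LBA.
gmirror_metadata=$last_lba

echo "GPT backup table:  LBA $gpt_backup_table_start to $(( last_lba - 1 ))"
echo "GPT backup header: LBA $gpt_backup_header"
echo "gmirror metadata:  LBA $gmirror_metadata"
```

The last two lines print the same number, which is the whole problem.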

So what to do? One method is to use the -h switch when setting up with gmirror:

gmirror label -h m0 da0 da1

This moves the metadata to the front of the disk, which will deconflict it with the GPT header okay but might crunch with other bootloaders, particularly from another OS that’s sharing the same disk, and which we have no control of. I say might. Personally, I wouldn’t be inclined to take the risk unless I’m dedicating the drive to FreeBSD.

The safe method is to NOT mirror the entire disk, only the partitions we’re interested in. Conventionally, and in the 2017 post, you mirrored the entire drive and therefore the drives were functionally identical without any further work. The downside was that if you replaced a drive you needed one exactly the same size (or larger), and not all 500GB drives are the same number of blocks, although there’s a pretty good chance these days. If you did happen to be a block or two short on the new one you’d be out of luck.

GEOMs and disks?

I’ve explained how to mirror a single partition already, but not gone into the technicalities. If you’re new to FreeBSD you might not have cottoned on to what a GEOM is. It’s short for “geometry”, which probably doesn’t help with understanding it one bit.

It gets the name from disk geometry, but don’t worry about the name. It’s an abstraction layer added to FreeBSD between the physical drive (provider) and higher level functions of the OS such as filing systems (consumers). You can add GEOM “classes” between the provider and consumer to provide RAID, mirroring, encryption, journaling, volume management and suchlike. Before ZFS, this was how you got fancy stuff done. Now, not so much. But the GEOM mirror class (aka gmirror) is still very useful indeed.

But the bottom line is that a disk partition can be a “provider” in just the same way as the whole disk, so what works for a disk will also work for a partition. Chances are the installer has partitioned up your drive thus:

=>        40  5860533088  ada0  GPT  (2.7T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  5856335872     3  freebsd-ufs (2.7T)
  5860532224         904        - free -  (452K)

This means /dev/ada0p3 is the UFS partition we’re interested in mirroring. Believe it or not, partition numbers start at one, not zero!

How to actually do it

So if you’ve installed your system and now want to add a GEOM mirror, proceed as follows. Let’s assume your second drive is ada1, which would be logical.

You’ll have to partition it so it has at least one partition the same size as the one you want to mirror. Chances are you’ll want all partitions common between drives. The quickest way to achieve this is to simply copy the partition table:

gpart backup ada0 | gpart restore -F ada1

You can sanity check this with gpart show ada1, which should output the same as gpart show ada0.

Load the geom_mirror module

kldload geom_mirror
echo 'geom_mirror_load="YES"' >> /boot/loader.conf

The second line adds it to loader.conf to make it load on boot, but only do it if it’s not there already. The kldload will complain if it’s already loaded, which is a good clue you don’t need the second line.
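
If you’d rather not rely on remembering whether the line is already there, a small helper can check first. This is only a sketch and the function name is my own; it appends a line to a file only when that exact line isn’t already present.

```sh
#!/bin/sh
# Append a line to a file only if that exact line is not already there.
# $1 = file, $2 = line
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Real use would be (as root):
#   ensure_line /boot/loader.conf 'geom_mirror_load="YES"'
```

Running it twice leaves a single copy of the line, so it’s safe to put in a setup script.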

Create the mirror

gmirror label ufsroot /dev/ada0p3 /dev/ada1p3

The “label” subcommand simply writes the metadata to the disks or partitions – remember disks or partitions are all the same to GEOM. The name “ufsroot” is chosen by me to be meaningful. Manuals use things like gm0 for GEOM mirrors and people have come to think it’s important they’re named this way, when the opposite is true. You already know it’s a GEOM mirror because the device is in /dev/mirror – it’s more helpful to know what it’s used for, e.g. UFS root, or swap, or var or whatever.

You can, while you’re at it, mirror as many partitions as you wish if you have separate ones for other purposes. You can even mirror a zfs partition without ZFS knowing you’re doing it if you’re crazy enough. Mirroring the swap partitions is something you should definitely consider.

You can check it’s worked with gmirror status, which should output something like this:

  Name         Status   Components
mirror/ufsroot COMPLETE ada0p3 (ACTIVE)
                        ada1p3 (SYNCHRONIZING)

Wait until it’s finished synchronising, which will take a long time on a large disk. Perhaps go to bed.
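
If you don’t fancy checking by hand, a loop can do the waiting for you. This is a minimal sketch with function names of my own choosing; it simply looks for the word SYNCHRONIZING in the gmirror status output, as in the sample above.

```sh
#!/bin/sh
# Succeeds (exit 0) if the status text on stdin shows a component still syncing.
still_syncing() {
    grep -q SYNCHRONIZING
}

# Poll once a minute until nothing reports SYNCHRONIZING any more.
# $1 = mirror name, e.g. ufsroot
wait_for_mirror() {
    while gmirror status "$1" | still_syncing
    do
        sleep 60
    done
    echo "mirror/$1 has finished synchronising"
}

# Usage: wait_for_mirror ufsroot
```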


Mount the mirror

This process will have created a new device called /dev/mirror/ufsroot but you still have to mount it in place of the “old” UFS partition. This is controlled in the normal way by /etc/fstab, so make a backup and fire up your favourite editor.

Look for the entry for /dev/ada0p3 and change it to /dev/mirror/ufsroot:

/dev/mirror/ufsroot / ufs rw 1 1
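
If you would rather script the edit than do it by hand, something like the following sketch would do. The function name is my own invention, and it deliberately prints the result rather than editing in place, so you can inspect it first.

```sh
#!/bin/sh
# Print fstab with one device name swapped for another.
# Only the first column is touched, and only on an exact match.
# $1 = fstab file, $2 = old device, $3 = new device
swap_fstab_device() {
    awk -v old="$2" -v new="$3" '
        $1 == old { sub(/^[^ \t]+/, new) }
        { print }
    ' "$1"
}

# Example:
#   cp /etc/fstab /etc/fstab.bak
#   swap_fstab_device /etc/fstab.bak /dev/ada0p3 /dev/mirror/ufsroot > /etc/fstab.new
#   diff /etc/fstab.bak /etc/fstab.new     # check it, then move it into place
```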

Reboot and you should be good.

Boot code

Although your UFS partition is mirrored, if ada0 fails, the system won’t boot as it stands as ada1 lacks the boot code. You can add this easily enough:

gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ada1

Finally, what about swap partitions? For robustness, mirror them too in the same way:

gmirror label swap /dev/ada0p2 /dev/ada1p2

Then edit fstab to swap on /dev/mirror/swap. Remember “swap” is a meaningful name chosen by you!

Alternatively you can edit fstab to swap on ada1p2 as well, which spreads the load (best for performance). Or you can just leave it as it is – if ada0 fails and you reboot you’ll have no swap until you fix it, but you’ll probably be worrying about other things if that happens.

FreeBSD/Linux as Fibre Broadband router Part 2

In Part 1 I described how to set up PPP and the pf firewall to provide NAT with port forwarding and other good things. In Part 2 I’ll add DHCP, and as a bonus I’ll add configuration for an IP address block, if you have that kind of ISP. If you want that kind of ISP but can’t find one, I can point at a few that do. In Part 3 I’ll cover DNS and BIND.

DHCP

There’s never been a DHCP server in the FreeBSD base, but it’s installed easily by compiling the port or installing the package. Your best bet for FreeBSD is the DHCP daemon written for OpenBSD, AKA the ISC dhcpd. But beware – the OpenBSD one, although called version 6.6, lags behind the other package isc-dhcp44 as it doesn’t support peer servers. If you’ve only got one DHCP server on your network, it’s fine. If you want to have primary and secondary servers, or load balance them, look at the latest ISC one instead. I’ll deal with that in another post.

pkg install dhcpd

Before you kick it off you really ought to edit the configuration file, /usr/local/etc/dhcpd.conf. There’s usually a second copy of it postfixed with .sample, and it’s pretty self documenting. I’m posting the basics from a real configuration, which I shall annotate to death. But first, something about the network we’re defining:

I’m going to have a LAN with 192.168.1.0/24 – which means IP addresses in the range 192.168.1.1 to 192.168.1.254. This isn’t a tutorial on routing – just leave the first and last address (0 and 255) alone for now. The network will have a domain. This is optional, but if you’re doing your own DNS you’ll want one. You don’t have to register this domain externally – you can make it up (please end it in .local!) – but let’s assume you have a real one: “example.com”. You’ve created a subdomain for this site called mysite.example.com and it has an A record to prove it, and you’ll probably want to delegate the DNS to it later. But if you’re not worried about domain names, don’t worry about all of this.

The router (i.e. the FreeBSD box) is going to be on 192.168.1.2, which is set up in rc.conf. It can’t be assigned automatically by DHCP because, well, we’re also the DHCP server and that would be silly.

Assuming your LAN-side network interface is bge0 (remember the modem is on bge1 in Part 1) the following line would do it:

sysrc ifconfig_bge0="inet 192.168.1.2 netmask 255.255.255.0"

Obviously change bge0 to the name of your actual Ethernet interface! You might wonder why I’m putting the router on 192.168.1.2 instead of 192.168.1.1, which is a common convention. It’s simple: There are so many home user network appliances that come with 192.168.1.1 as their default IP address, and if you plug one in to your LAN the clash will cause merry hell before you’ve been able to go to their web interface to configure it to something else.

I want some devices to have a fixed IP address supplied by DHCP, and other things to have dynamically allocated ones – friends using the guest WiFi, for example. Having network infrastructure like switches and WAPs on static addresses, defined by DHCP, is a good way to go. Connecting network printers to Windoze is smoother if they’re on a fixed IP too. But going around and setting it on each device is a pain, so do it by DHCP where it’s defined in one place and can be managed in one place. It works by recognising the MAC address in the request and giving back whatever IP address you have chosen.

As a final tip, keep your network address plan as comments in dhcpd.conf – it’s where you want the information anyway. And with that, here’s the sample file:


# This is the domain name that will be supplied to everything on
# the LAN by default. This is the domain that will be searched if you
# enter a host name. For example, if you want to connect to "fred-pc" it
# will look for it as fred-pc.mysite.example.com, which if you have
# your DNS set up correctly, will find it quickly.
option domain-name "mysite.example.com";

# This specifies the DNS server(s) the machines on the LAN
# will use. We're specifying the same as the router, because
# we'll be running DNS there. If you don't want to, just use the
# IP address of DNS server supplied by your ISP.
option domain-name-servers 192.168.1.2;

# These just specify the time a machine on the LAN gets to hold
# on to a dynamic address before it needs to renew it.
default-lease-time 43200;
max-lease-time 86400;

# This defines our pool of dynamically allocated addresses,
# and I've chosen the range 100..199. Options here override the
# options above (outside the {...}) in the way you might expect.
# I've set the default lease time to 900 seconds (15 minutes)
# for testing purposes only. 2h is normal but it's up to you.
# I normally go for 12h.

subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.100 192.168.1.199;
  option broadcast-address 192.168.1.255;
  default-lease-time 900;
}

# The next block is assigning a fixed IP address to 
# a switch, because I don't want it to move. This just needs the 
# MAC address of the device and the fixed-address you want to give it.
# You can have as many of these as you like. The name "switch1" is really
# just for your own reference.

host switch1 {
        hardware ethernet 00:02:FC:CB:1E:7D;
        fixed-address 192.168.1.3;
}
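
If you end up with a lot of fixed devices, typing the host blocks by hand gets tedious. This is a minimal sketch that turns a plain list of name, MAC address and IP address into blocks in the same shape as the one above. The list format and the function name are my own, and the second device is invented purely for illustration.

```sh
#!/bin/sh
# Read "name mac ip" lines on stdin and print a dhcpd.conf host block for each.
# Blank lines and lines starting with # are skipped.
make_hosts() {
    while read -r name mac ip
    do
        case "$name" in
            ''|'#'*) continue ;;
        esac
        printf 'host %s {\n' "$name"
        printf '        hardware ethernet %s;\n' "$mac"
        printf '        fixed-address %s;\n' "$ip"
        printf '}\n\n'
    done
}

make_hosts <<'EOF'
# name     MAC address         IP address
switch1    00:02:FC:CB:1E:7D   192.168.1.3
printer1   00:11:22:33:44:55   192.168.1.4
EOF
```

Redirect the output to a file and paste it into dhcpd.conf. The list itself doubles as the address plan.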

For more information see this post about assigning names, and the dhcpd.conf.sample, which has scenarios far more complex than you’ll need on a simple LAN.

Enable it on reboot with:

sysrc dhcpd_enable=yes

You can then start it manually with service dhcpd start.

If you want to make changes to dhcpd.conf you can at any time, but they won’t take effect until you restart dhcpd (with service dhcpd restart). There’s no way of having it just do a reload. Details of the leases it has issued are in /var/db/dhcpd.leases, which is just a text file and you can easily read it.

Routing a whole subnet

Supposing you have more than one IP address coming down the PPPoE tunnel at you? This is a service you can get from your ISP, giving you multiple IP addresses for various purposes – such as running servers. Other ISPs give you a single dynamic address, or worse, an IP address generated by CG-NAT. I’d argue this ceases to meet the definition of “Internet Service” at this point.

But assuming you have a block of static addresses, how do you get ppp to use them? I haven’t seen this documented ANYWHERE and figuring it out involved a great deal of trial and error. Shout out to shurik for encouraging me to keep going where ppp.linkup was concerned.

The easy way to add an alias to your tunnel (which you’ll recall we called wan0) is to use ifconfig and simply add it. But the trick with tunnels is to add the alias IP address and the remote tunnel address (i.e. HISADDR). You can find out what HISADDR is using ifconfig:

# ifconfig wan0
wan0: flags=1008051<UP,POINTOPOINT,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 1492
        options=80000<LINKSTATE>
        inet 1.2.3.4 --> 44.33.22.11 netmask 0xffffffff
        groups: tun
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 658

In the output above, 1.2.3.4 is the IP address negotiated by IPCP – i.e. your public IP address. 44.33.22.11 is the IP address of the other end of the tunnel. In the parlance of the PPP utility, HISADDR. Earlier we set the default route to HISADDR. There are good reasons why HISADDR is dynamic, not least of which is having a pool of gateways for redundancy, so you have to check what it actually IS today before you assign an alias public address to the tunnel.

Then it’s a simple matter of adding further addresses using ifconfig:

ifconfig wan0 alias 1.2.3.41/32 44.33.22.11

Yes, it’s not quite the same format as adding an alias to an Ethernet interface, as the remote address follows the local one.

You can write a little script to do them automatically:

#!/bin/sh
HISADDR=$(ifconfig wan0 | grep "inet 1.2.3.4" | cut -w -f 5)
ALIASES="1.2.3.41 1.2.3.42 1.2.3.43 1.2.3.44"
for a in $ALIASES
do
    ifconfig wan0 delete $a
    ifconfig wan0 alias $a/32 $HISADDR
done

Note that I’m using grep to find the correct inet address based on the static address I know the interface has. Fiddle this to suit your static address, or if you don’t have one, grep for inet and hope the first it finds is correct. I’m also deleting the old aliases as they might need to be recreated using the new HISADDR.
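
One thing to watch: grep "inet 1.2.3.4" will also match aliases such as 1.2.3.41 once they exist, so the cut can return more than one line. A slightly stricter sketch uses awk to compare whole fields instead, assuming the ifconfig output format shown earlier (the function name is hypothetical):

```sh
#!/bin/sh
# Read ifconfig output on stdin and print the remote end of the tunnel
# for the inet line whose local address is exactly $1.
hisaddr_for() {
    awk -v addr="$1" '$1 == "inet" && $2 == addr && $3 == "-->" { print $4; exit }'
}

# Usage: HISADDR=$(ifconfig wan0 | hisaddr_for 1.2.3.4)
```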

This is all well and good, but when do you run the script? Automating it is the trick. Fortunately there’s a hook in ppp, where it processes the file /etc/ppp/ppp.linkup when the link comes up. As far as I can tell it’s the same format as ppp.conf, and you have to label the service name in the same way. What’s not documented is how you add alias addresses, but I’ve found a way by getting it to run ifconfig for you. If you start a line with “ !bg ”, what follows is run. It’s run without an environment so you have to specify all paths to whatever you want to run in full, but it does work and does expand macros like HISADDR. The space in front of the ! is important! Incidentally, there’s also a ppp.linkdown.

Here’s my /etc/ppp/ppp.linkup

cloudscape:
  !bg /sbin/ifconfig wan0 alias 1.2.3.40/32 HISADDR
  !bg /sbin/ifconfig wan0 alias 1.2.3.41/32 HISADDR
  !bg /sbin/ifconfig wan0 alias 1.2.3.42/32 HISADDR
  !bg /sbin/ifconfig wan0 alias 1.2.3.43/32 HISADDR
  !bg /usr/sbin/service sshd restart

I would very much like to find the documentation for this, but the author (Brian Somers) has moved on to other things and the documentation that’s out there appears to be all there is. It was written for dial-up connections and wasn’t really designed for fixed lines with multiple public IP addresses.

Note the final line “ !bg /usr/sbin/service sshd restart” – this restarts sshd once the WAN interface is established, otherwise it wouldn’t be able to bind to it. If you don’t want public-facing ssh then you should alter the ListenAddress lines in /etc/ssh/sshd_config to exclude these addresses and the restart line becomes unnecessary.

Meanwhile the other PPP daemon, mpd5, which is supposed to be better, was listed in the FreeBSD Handbook as being for PPPoA, pushing user-ppp for PPPoE. This isn’t actually the case, and I may be revisiting this using mpd5 at some point because it’s faster and more efficient, and I don’t need all the extra wonderful NAT and firewall features of user-ppp.

ISC-DHCPd

In this article I’ve used the standard OpenBSD version of the ISC DHCP server. It’s not the same as the full one, which also handles replication. In Part 4 I’ll cover DNS and DHCP replication and redundancy but as I haven’t written it you might want to install the package isc-dhcp44-server instead and use that. For a single server configuration it’s basically the same, but it gets you ahead of the game if you want to replicate servers.

LetsEncrypt, acme.sh and Apachectl reloads

This morning I woke up to an expired TLS certificate on this blog. This is odd, as it’s automatically renewed from LetsEncrypt using acme.sh, kicked off by a cron job. So what went wrong?

I don’t write about LetsEncrypt or ACME much as I don’t understand everything about it, and it keeps surprising me. But I had discovered a problem with FreeBSD running the latest Apache 2.4 in a jail. As I run my web servers in jails, this applies to me.

I like acme.sh. It’s a shell script. Very clever. No dependencies. Dependencies are against my religion. Why would anyone use a more complex system when there’s something simple that works?

For convenience reasons the certificates are renewed outside of a jail, and the sites are created using a script that sets it all up for me. One source of certificates for multiple jails; it’s easier to manage. It manages sites on other hosts using a simple NFS mount.

When you use acme.sh to renew a certificate for Apache you need to be able to plonk something on the web site. This is easy enough – the certificate host (above the jails) can either get direct access through the filing system, or via NFS. It then gets the new certificate and copies it into the right place. When you first issue yourself a certificate you specify the path you want the certificate to go, and the path to the web site. You also specify the command needed to get your web server to reload. It magically remembers this stuff so the cron job just goes along and does them all. But that’s where the fun starts.

I rehosted the blog on a new instance of Apache, and created a new temporary website to make sure SSL worked – getting acme.sh to issue it a certificate in the process. All good, except I noticed that inside a jail, the new version of Apache stops but doesn’t restart after an “apachectl graceful”. The same with “apachectl reload”. Not great, but I tried using “service -j whatever apache24 restart”. A bit drastic but it worked, and I’ve yet to figure out why other methods like “jexec whatever apachectl graceful” stall.

So what happened this morning at 6am? There were some certificates to renew and acme.sh --cron accidentally KOed Apache. It’s the first time any had expired.

Running acme.sh manually and then restarting Apache manually worked, but it’s hardly the dream of automation promised by Unix. Debugging the script I found it was issuing a graceful restart command, and I thought I’d specified something more emphatic. So I started grepping for the line it was using, assuming it must be in a config file somewhere. Nothing.

Long story short, I eventually found where it had hidden the command: in .acme.sh/domain.name/domain.name.conf, in spite of having looked there already. It turns out that it’s the line “Le_ReloadCmd=”, and it’s unique for each domain (sensible idea), but it’s base64 encoded instead of being plain text! And it’s wrapped between “_ACME_BASE64__START_” and “_ACME_BASE64__END_”. I assume this is done to avoid difficulties with certain characters in shell scripts but it’s a bit of a pain to edit it. You can create a new command by piping it through base64 and editing very carefully, but readable it ain’t.
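
To take some of the pain out of it, here is a rough sketch of two helpers, one to encode a command and one to read back what is already in a conf file. The function names are mine, it assumes a base64 utility that understands -d, and the marker strings are simply the ones quoted above, so check an existing conf file for the exact form before writing anything back.

```sh
#!/bin/sh
# Print the base64 form of a reload command (payload only, no markers).
encode_reloadcmd() {
    printf '%s' "$1" | base64
}

# Read a Le_ReloadCmd= line on stdin and print the command hidden inside it.
# Everything up to the START marker and from the END marker on is stripped;
# base64 never contains an underscore, so the markers can't be confused with data.
decode_reloadcmd() {
    sed -e 's/.*ACME_BASE64__START_//' -e 's/_*ACME_BASE64__END_.*//' | base64 -d
    echo
}

# Usage:
#   grep '^Le_ReloadCmd=' domain.name.conf | decode_reloadcmd
#   encode_reloadcmd "service -j web apache24 restart"
```

Decoding is read-only, so it’s a safe way to find out what each domain is really going to run.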

There is another way – just recopy the certificate. Unfortunately you need to know, and use, the same options as when you originally created it – you can’t just issue a different --reloadcmd. You can check these by looking at the domain.name.conf file, where fortunately these are stored in plain text. Assuming they’re all the same, this little script will do them all for you at once. Adjust as required.

#!/bin/sh

# Make sure you're in the right directory
cd ~/.acme.sh

# Jail containing web site, assumed all the same.
WJAIL=web

for DOM in $(find . -type d -depth 1 | sed "s|^\./||")
do

echo acme.sh $TEST -d $DOM  --install-cert \
        --cert-file /jail/$WJAIL/data/certs/$DOM/cert.pem \
        --key-file /jail/$WJAIL/data/certs/$DOM/cert.key \
        --fullchain-file /jail/$WJAIL/data/certs/$DOM/Fullchain.pem \
        --reloadcmd "service -j $WJAIL apache24 restart"
done

You will notice that this only echoes the command needed, so if anyone’s crazy enough to copy/paste it then it won’t do any damage. Remove the “echo” when you’re satisfied it’s doing the right thing for you.

Or you could just edit all the conf files and replace the Le_ReloadCmd= line – you only have to generate it once, after all.

Unix chmod and chown

Following on from Basic UNIX file commands, here’s a bit there wasn’t time for on changing metadata on files.

These two commands change a file’s permissions and ownership. Permissions are information associated with a file that decides who can do what with it. This was once called the file mode, which is why the command is chmod (CHange MODe). Files also have owners, both individuals and groups of users, and the command to change this is chown (CHange OWNer).

chown is easiest, so I’ll start there. To make a file belong to fred the command is:

chown fred myfile

To change the owning group to accounts:

chown :accounts myfile

And to change both at once:

chown fred:accounts myfile

Changing a file’s permissions is more tricky, and there are several ways of doing it, but this is probably the easiest to remember. You’ll recall that each file has three sets of permissions: Owner, Group and Other. The permissions themselves are read, write and execute (i.e. it’s an executable program).

chmod can set or clear a load of permissions in one go, and the format is basically who the permission applies to (u, g or o), then ‘+’ or ‘-’ for set or clear, then the permissions themselves (r, w or x). What? It’s probably easier to explain with a load of examples:

chmod u+w myfile

Allows the owner of the file to write to it (u means user, i.e. the owner).

chmod g+w myfile

Allows any user in the file’s group to write to it.

chmod o+r myfile

Allows any user who is neither the owner nor in the file’s group to read it. (o means “other”).

You can combine these options:

chmod ug+rw myfile

Allows the owner and the group to read and write the file.

chmod go-w myfile

Prevents anyone but the user from being able to modify the file.

If you want to run a program you’ve just written called myprog:

chmod +x myprog

If you don’t specify anything before the +/-, chmod applies the change to everyone (user, group and other), apart from any permissions masked out by your umask.

You might notice an ‘x’ permission on a directory – in this case it means the directory is searchable to whoever has the permission.
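
The chmod examples above are easy to try out safely on a scratch file. This is just a sketch: the perms helper is my own, and “a” (meaning all of u, g and o) is used once to wipe the slate clean. The comments show what ls reports after each step.

```sh
#!/bin/sh
# Show the permission string for a file, e.g. -rw-r--r--
perms() {
    ls -ld "$1" | cut -c1-10
}

umask 022                            # the usual default
f=$(mktemp /tmp/chmodtest.XXXXXX)    # scratch file to play with
chmod a-rwx "$f"; perms "$f"         # ----------
chmod u+w   "$f"; perms "$f"         # --w-------
chmod ug+rw "$f"; perms "$f"         # -rw-rw----
chmod o+r   "$f"; perms "$f"         # -rw-rw-r--
chmod go-w  "$f"; perms "$f"         # -rw-r--r--
chmod +x    "$f"; perms "$f"         # -rwxr-xr-x
rm -f "$f"
```

The last step shows what happens with nothing before the +: the x is set for user, group and other, because a umask of 022 only masks out write.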