Large swap files on FreeBSD die with mystery “Killed” – howto add lots of swap space

Adding extra swap space to FreeBSD is easy, right? Just find a spare block storage device and run swapon with its name as an argument. I’ll put a step-by-step on how you actually do this at the end of the post in case this is news to you.

However, I’ve just found a very interesting gotcha, which could bite anyone running a 64-bit kernel and 8Gb+ of RAM.

From here we’re getting into the FreeBSD kernel – if you just want to know how to set up a lot of swap space, skip to the end…

I’ve been running a program to process a very large XML file into a large binary file – distilling 100Gb of XML into 1Gb of binary. This is the excuse for needing 16Gb of working storage (please excuse my 1970s computer science terminology, but it’s a lot more precise than the modern “memory” and it makes a difference here).

I was using 2Gb of core and 8Gb of swap space, but this was too little so I added an extra 32Gb of swap file. Problem sorted? Well, top and vmstat both reported 40Gb of swap space available so it looked good. However, on running the code it bombed out at random, with an enigmatic message “Killed” on the user console. Putting trace lines in the code narrowed it down to a random point while traversing a large array of pointers to pointers to the 15Gb heap, about an hour into the run. It looked for all the world like pointer corruption causing a Segmentation Fault or Bus Error, but if the process had got that kind of signal it should have done a core dump, and it wasn’t happening. The output suggested a SIGKILL. But it wasn’t me sending it, and there were no other users logged in. Even a stack space error, which might have happened as qsort() was involved, was ruled out as the cause – and the kernel would have sent an ABORT, not a KILL in this case.

I finally tracked it down to a rather interesting “undocumented” feature. Within the kernel there is a structure called swblock in the John Dyson/Matthew Dillon VM handler, and a pointer called “swap” points to a chain of these structures. Its size is limited by the value of kern.maxswzone, which you can tweak in /boot/loader.conf. The default (AMD64 8.2-Release) allows for about 14Gb of swap space, but because it’s a radix tree you’ll probably get a headache if you try to work it out directly. However, if you increase the swap space beyond this it’ll report as being there, but when you try to use the excess, crunch!

Although this variable is tunable, it’s also hard-limited to 32M entries by VM_SWZONE_SIZE_MAX in include/param.h; each entry can manage 16 pages (if I’ve understood the code correctly). I have no idea why the limit is there, and I haven’t yet tried messing with it as I have no need. If you want to see exactly what’s happening, look at vm/swap_pager.c.

So, what was happening to my process? Well, it was being killed by vm_pageout_oom() in vm/vm_pageout.c. This gets called when swap space OR the swblock space is exhausted, either in vm_pageout.c or swap_pager.c. In some circumstances it prints “swap zone exhausted, increase kern.maxswzone\n” beforehand, but not always. Its effect is to find the largest running non-system process on the system and shoot it using killproc().

Mystery solved.

So, here’s how to set up 32Gb of USABLE swap space.

First, find your swap device. This is either a disk slice available in /dev or, if you want to swap to a file, you need a ram disk to do the mapping. You can have as many swap devices as you like and FreeBSD will balance their use.

If you can’t easily add another drive, your best option is to add an extra swap file in the form of a ram disk on the existing filing system. You’ll need to be the root user for this.

To create a ram disk you’ll need a file to back it. The easy way to create one is using “dd”:

dd if=/dev/zero of=/var/swap0 bs=1G count=32

This creates a 32G file in /var filled with nulls – adjust as required.

It’s probably a good idea to make this file inaccessible to anyone other than root:

chmod 0600 /var/swap0

Next, create your temporary file-backed RAM disk:

mdconfig -a -t vnode -f /var/swap0 -u 0

This will create a device called /dev/md with a unit number specified by -u; in this case md0. The final step is to tell the system about it:

swapon /dev/md0
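
The steps above can be collected into one small script. This is only a sketch: the function name add_swap_file and the RUN switch are my own inventions for illustration, not anything FreeBSD provides, and it assumes the file name contains no spaces. By default it just prints the commands so you can check them before running anything as root.

```sh
#!/bin/sh
# Sketch only: gathers the dd, chmod, mdconfig and swapon steps in one place.
# With RUN unset the commands are printed, not executed; set RUN=1 (as root)
# to run them for real.
add_swap_file() {
    file=$1      # backing file, e.g. /var/swap0
    size_gb=$2   # size of the file in gigabytes
    unit=$3      # md unit number, e.g. 0 for /dev/md0

    set -- \
        "dd if=/dev/zero of=$file bs=1G count=$size_gb" \
        "chmod 0600 $file" \
        "mdconfig -a -t vnode -f $file -u $unit" \
        "swapon /dev/md$unit"

    for cmd in "$@"; do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1    # stop at the first step that fails
        else
            echo "$cmd"         # dry run: show what would be done
        fi
    done
}

# Dry run for the 32G example used in this post.
add_swap_file /var/swap0 32 0
```

Once the printed commands look right, something like RUN=1 sh thisscript.sh (run as root) would execute them in order and stop at the first failure.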

If you wish, you can make this permanent by adding the following to /etc/rc.conf:

swapfile="/var/swap0"

Now here’s the trick – if your total swap space is greater than 14Gb (as of FreeBSD 8.2) you’ll need to increase the value of kern.maxswzone in /boot/loader.conf. To check the current value use:

sysctl kern.maxswzone

The default output is:

kern.maxswzone: 33554432

That’s 0x2000000, or 32M. For 32Gb of VM I’m pretty sure you’d be okay with 0x5000000 (in round numbers), which translates to 83886080, so add this line to /boot/loader.conf (create the file if it doesn’t exist) and reboot:

kern.maxswzone="83886080"
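
If you want a figure for a different amount of swap, here is a rough way to get one. It is a sketch resting on an assumption I haven’t verified against the kernel source: that usable swap scales linearly with kern.maxswzone, starting from the figures above (the default of 33554432 covering about 14Gb). The function name estimate_maxswzone and the rounding up to a multiple of 0x1000000 are mine, added for headroom.

```sh
#!/bin/sh
# Assumption: usable swap scales linearly with kern.maxswzone, using the
# post's figures of 33554432 (the default) for roughly 14Gb of swap.
default_zone=33554432
default_gb=14

# estimate_maxswzone TOTAL_GB: scale the default linearly, then round up to
# the next multiple of 0x1000000 to leave some headroom.
estimate_maxswzone() {
    raw=$(( $1 * default_zone / default_gb ))
    step=16777216
    echo $(( (raw + step - 1) / step * step ))
}

estimate_maxswzone 32                         # prints 83886080
printf '0x%x\n' "$(estimate_maxswzone 32)"    # prints 0x5000000
```

For 32Gb this lands on the same 0x5000000 as the round-number guess above, which is reassuring but proves nothing about the kernel; treat the result as a starting point and watch for the “swap zone exhausted” message.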

 

Spamassassin, spamd, FreeBSD and “autolearn: unavailable”

I recently built a mail server using FreeBSD 8.2 and compiled spamassassin from the current ports collection, to run globally. spamd looked okay and it was adding headers, but after a while I noticed the Bayesian filtering didn’t seem to be working in spite of it having had enough samples through.

A closer look at the added headers showed “autolearn: no”, or “autolearn: unavailable” but never “autolearn: ham/spam”.

What was going on? RTFM and you’ll see spamassassin 3.0 and onwards has added three new autolearn return codes: disabled, failed and unavailable. The first two are pretty self-explanatory: either you’d set bayes_auto_learn 0 in the config file or there was some kind of error thrown up by the script. But I was getting the last one:

unavailable: autolearning not completed for any reason not covered above. It could be the message was already learned.

I knew perfectly well that the messages hadn’t already been learned, so was left with “any reason not covered by the above”. Unfortunately “the above” seemed to cover all bases already. There wasn’t any clue in /var/maillog or anywhere else likely.

I don’t much care for perl scripts, especially those that don’t work, so after an unpleasant rummage I discovered the problem. Simply put, it couldn’t access its database due to file permissions.

The files you need to sort are at /root/.spamassassin/bayes_* – only root will be able to write to them, not spamd – so a chmod is in order.

A better solution is to move the Bayesian database out of /root – /var would probably be more appropriate. You can achieve this by adding something like this to /etc/spamd.cf (which should link to /usr/local/etc/mail/spamassassin/local.cf):

bayes_path /var/spamassassin/bayes/bayes
bayes_file_mode 0666
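
To confirm that permissions are the culprit (or that the fix has worked), a quick check along these lines may help. It is a sketch: report_writable is a name I’ve made up, and it only tells you what the user running it can write to, so it needs to be run as whichever user spamd runs as on your system. Running it as root proves nothing, because root can write to everything.

```sh
#!/bin/sh
# report_writable PREFIX: for each PREFIX_* file (bayes_toks, bayes_seen and
# so on), say whether the user running this script can write to it.
report_writable() {
    for f in "$1"_*; do
        [ -e "$f" ] || continue    # the glob matched nothing
        if [ -w "$f" ]; then
            echo "writable: $f"
        else
            echo "NOT writable: $f"
        fi
    done
}

# The prefix takes the same form as bayes_path: directory plus "bayes".
report_writable /root/.spamassassin/bayes
```

No output at all means no database files were found under that prefix, or that the user can’t even see into the directory, which for /root is quite likely.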

I suspect that the lower-security Linux implementation avoids these problems by setting group-write-access as default, but FreeBSD, being a server OS, doesn’t. It’s also a bug in the error handling for the milter – it should clearly report as a “failed” and write something to the log file to tell you why.

You should be able to restart spamd after the edit with /usr/local/sbin/spamdreload, but to be on the safe side I use the following after shutting down Sendmail first.

/usr/local/etc/rc.d/spamass-milter restart
/usr/local/etc/rc.d/sa-spamd restart

I don’t know if Sendmail can cope well with having spamass-milter unavailable, but why take the risk?

 

OpenLDAP, Thunderbird and roving address books

IMAP is great. It lets you keep your mail synchronised between any number of machines, including webmail, and everything works just fine. The only snag is that your address book isn’t included. I’d always assumed this was what LDAP was for: a centralised directory of names, and other things, with the useful bit being the address book. Thunderbird is my current favourite mail client on the basis that it actually works better than Outlook. It supports LDAP address books, and has offered to configure one for me many times. All I needed to do was configure slapd (the OpenLDAP server daemon) and point Thunderbird at it.

This blog entry isn’t a tutorial in configuring FreeBSD, OpenLDAP and Thunderbird to work together. I’m saving you from wasting a lot of your time trying. It does “work”, once you’ve sorted out schemas and got to grips with the arcane syntax of the configuration files and the hierarchical nature of the thing. It’s just that it’s useless even when it’s working because it’s READ-ONLY. Being able to add and amend entries in my address book is so fundamental to the nature of an address book that I didn’t bother to check that Thunderbird could do it. What’s the use of a read-only address book? Well, there might be some point in a large organisation where a company-wide address book is needed, administered by a tame geek in the basement. For the rest of us it’s a Filofax with no pen.

So what are the good people at Mozilla playing at? The omission of read/write has been listed in their bug database for over ten years, and no one has tackled it. I thought about it for a while, but given that Lightweight-DAP is a misnomer on a spectacular scale I thought again. Clearly no one who knows about LDAP actually likes it enough to want to help; either that or no one actually understands it apart from the aforementioned geek in the basement, and he’s sitting tight because allowing users to edit address books might be detrimental to his pizza supply.

The time is right for a genuinely lightweight protocol for sharing address books in a sane and sensible manner; something like IMAP for addresses. I’m therefore writing one. Unfortunately I’m not so clued up on Thunderbird’s internal workings; if you are and wish to implement the front end please drop me a line and I’ll write a protocol and server that works.

Unfortunately this one issue is a killer app for Microsoft’s lightweight over-priced Mail system called Exchange. It’s a bit of a dog (inflexible) but at least Microsoft seems to have got this fundamental functionality for sharing personal address books between mail clients sorted out. I believe it uses something similar to LDAP underneath (along with IMAP for the mail itself); so it’s not impossible.

I’m very surprised to find myself having anything good to say about Outlook/Exchange Server. I might still be traumatised from the discovery that the obvious LDAP solution was nothing of the sort. It’s just that LDAP is so damn complex for no apparent reason that it gives the impression it must be great if you could only understand it.

WordPress ends up with wrong upload directory after moving servers

If you’ve moved WordPress from one server (or home directory) to another, everything looks fine until you come to upload a file. It then complains that it can’t write to the upload directory. Something like:


Unable to create directory /usr/home/username/wp-content/uploads/xxx/yy. Is its parent directory writable by the server?

A search through the PHP config files doesn’t reveal a path to the upload directory, and if it’s possible to change it from the Dashboard I’ve yet to find where.

The only remaining option was the MySQL database, and fishing around in the likely-sounding wp_options table I found an option_name of “upload_path”. A quick query showed this was still pointing at the old home directory.

To save you fishing, here’s the SQL needed to set things right. Obviously select the appropriate database with “use” first!


update wp_options
set option_value='<path-to-your-wordpress-installation>/wp-content/uploads'
where wp_options.option_name='upload_path';

This is how you do it using straight SQL from the command line. If you’re using some sort of restricted web hosting account, “See your system administrator” and point them here.
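
If you’d rather not type the SQL by hand, a small shell function can generate it for you. This is a sketch only: wp_upload_sql is my own name, the path in the example is made up, and it assumes the default wp_options table name and a path with no single quotes in it. It prints a select first, so you see the old value, followed by the update.

```sh
#!/bin/sh
# wp_upload_sql WP_DIR: print SQL that shows the stored upload_path and then
# points it at WP_DIR/wp-content/uploads.
# WP_DIR must not contain a single quote (no escaping is attempted).
wp_upload_sql() {
    printf "select option_value from wp_options where option_name='upload_path';\n"
    printf "update wp_options set option_value='%s/wp-content/uploads' where option_name='upload_path';\n" "$1"
}

# Hypothetical new installation directory; substitute your own.
wp_upload_sql /usr/home/newuser/public_html
```

Check the output, then pipe it into the client, for example wp_upload_sql /usr/home/newuser/public_html | mysql -u root -p wordpress, where “wordpress” stands in for whatever your database is actually called.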

Apps to force Web into decline?

Who’s going to win the format war – iOS (Apple iPad) or Android? “What format war?” you may ask. Come on, it’s obvious. Some are saying that the web is either dying (dramatic) or at the least being impacted by the modern fashion of Apps, and these run on iOS or Android (mostly). Actually, by sales Apple is winning hands-down.

This IS a format war, because developers need to support one or other platform – or both – and users need to choose the platform that has the content they need. There is some sense in it when database contents are queried and displayed in Apps rather than on web pages.

Apple has the early advantage, and the cool factor. But it’s the most expensive and the most hassle to develop for, as Apps can only be sold through Apple. Android is a free-for-all. Apps can be sold through Google, or anyone else making them available for download in the future. It’s an open standard. The security implications of this are profoundly worrying, but this is another story.

So, running iOS is expensive, Android is insecure and neither is very compatible. That’s before you consider Blackberry and any requirement to run an App on your Windows or Linux PC.

But, I don’t think this is a conventional format war. It’s mostly software based, and open standards software might just win out here (and I don’t mean Android). People like paying for and downloading Apps. Web browsers can (technically) support Apps, using Java and the upcoming HTML5 in particular. Why target a specific operating environment when you can target a standard web browser and run on anything?

As an aside, HTML5 is sometimes hailed as something new and different when in fact it’s just evolution and tidying up. The fact is that HTML is cross-platform and will deliver the same functionality as Apps. HTML5 simply standardises and simplifies things, making cross-platform more open-standard, so every browser will be able to view page content without proprietary plug-ins, including better support for mobile devices, which lost out from the late 1990s onwards when graphic designers decided HTML was a WYSIWYG language.

Some modern-day pundits will proclaim that data will be accessed more through Apps in the future, and the web has had its decade. Apparently a third of the UK is now using smart-phones. Whether this statistic is correct or not, they’re certainly popular and I’ll concede that Apps are here to stay. But in my vision of the future they won’t be running on iOS, Android or Blackberry – they’ll be written using HTML5 and run on anything. It’s platform independence that launched HTML and the web twenty years ago, and it’s what will see off the competition for the next twenty years.

Phone hacking gets serious

A committee of MPs are currently grilling the management of News International trying to find someone to blame for the ‘phone “hacking” scandal. It has to be someone convenient; definitely not the people who are actually responsible. That’d lose them votes. This is because those ultimately responsible are the readers of the tabloid newspapers with their insatiable appetite for the personal details of anyone famous, or in the news.

Readers of the Daily Mirror and the Sun/News of the Screws are mostly to blame, together with the Daily Mail, Express and “celebrity” magazines. They’re creating the demand; the publishers are in business to satisfy a demand. This isn’t to say I approve of the business – the cult of celebrity is one of the most rotten things about modern society – but blaming those making a living by never underestimating the public’s bad taste is like condemning a lion for eating an antelope. The tabloids are profitable; proper newspapers are a money pit.

But the politicians don’t want to blame the tabloid readers (aka most of the electorate), and neither does the news media want to blame their best customers. Instead they’re nervously jostling for position in a circular firing squad.

Politically, blaming the Murdoch Press is the best answer. Politicians would love to control the media, but in the west this is a tricky position to engineer. The fact that a sub-contracted investigator to one tabloid accessed the voice-mail of a missing person who subsequently turned out to have been murdered is a pretty flimsy pretext, but they appear to be making the most of it. Oh yes – they messed with a police investigation by deleting old messages. Hmm. My mobile ‘phone voicemail does this automatically – why blame the hack? It’s just convenient: it makes it seem more shocking, and no one is going to mention this obvious explanation as a possibility. This morning I heard Neil Kinnock suggesting the press needed regulating. Well, it worked for Castro, Stalin and Kim Jong Il, his socialist role models?

Last weekend the News of the World was forced to close; a newspaper (in the broad sense of the word) was muzzled to cheers of delight. They were doing something illegal, and they had to go. Actually it was only made illegal in 2000 by Blair’s government (arguably it only came into force in 2002). Prior to this it was dodgy ground, but there was always a public interest defence. This is key. Journalists used to be able to snoop on whoever they chose as long as it was in the public interest. Each individual case had to be argued on its merits; it was safe. Now journalists face a very real risk of prosecution simply for looking into the dealings of corrupt politicians, organised criminals and dodgy police officers (especially). New Labour’s idea was that only the police and security services should be allowed to do anything like this – i.e. the state should have a monopoly on snooping. This is the same model used by the Gestapo, the KGB, the OVRA and the Stasi. It’s still used in various countries in the modern world, where there is no free press to hold the secret police and politicians to account.

Does this mean Blair and New Labour deserve to be lumped in with the dictatorial heads of police states? Probably not – they produced a large amount of stupid legislation in a hurry and I could well believe this was simple incompetence. However, it’s notable that politicians now are hardly lining up to condemn these totalitarian laws. Why would they? Among the major beneficiaries have been the politicians themselves, who like to have a protected “private life” outside the glare of publicity.

As a final note, watch for the Mirror – they were the subject of more complaints about illegal intercepts (by a long way) than The Sun, Screws or anyone else on Fleet Street (or Wapping). So far they’re being protected. If you think this is a conspiracy theory, check the complaints for yourself on the Ofcom web site. Don’t expect the news media to report it – not in their interests!

Google’s Evil Browser policy

Gmail Fail

Google’s VP of Engineering (Venkat Panchapakesan) has published one of the most outrageous policy statements I’ve seen in a long time – not in a press release, but in a blog post.

He’s saying that Google will discontinue support for all browsers that aren’t “modern” from the end of July, with the excuse that its developers need HTML5 before they can improve their offerings to meet current requirements. “Modern” means less than three versions old, which currently rules out anything prior to IE8 (now that IE 10 is available in beta) and Firefox 3.5. This is interesting – Firefox 4 has just been released, and I’m beta testing Firefox 5, with Firefox 7 talked about by the end of 2011. This will obsolete last month’s release of Firefox 4 in just six months. Or does he mean something different by version number? Anyone who knows anything about software engineering will tell you that major differences can occur with minor version number changes too, so it’s impossible to interpret what he means in a technical sense.

I doubt Google would be stupid enough to “upgrade” its search page. This will affect Google Apps and Gmail.

The fact is that about 20% of the world is using either IE 6 or a similar vintage browser. Microsoft and Mozilla have a policy of encouraging people to “upgrade” and are supportive of Google. Microsoft has commercial reasons for doing this; Mozilla’s motives are less clear – perhaps they just like to feel their latest creations are being appreciated somewhere.

What these technological evangelists completely fail to realise is that not everyone in the world wishes to use the “latest” bloated version of their software. Who wants their computer slowed down to a crawl using a browser that consumes four times as much RAM as the previous version? Not everyone’s laptop has the 2Gb of RAM needed to run the “modern” versions at a reasonable speed.

It’s completely disingenuous to talk about users “upgrading” – it can easily make older computers unusable. The software upgrade may be “free” but the hardware needed to run it could cost dear.

It’ll come as no surprise to learn that the third world has the highest usage of older browser versions; they’re using older hardware. And they’re using older versions of Windows (without strict license enforcement). There’s money to be made by forcing the pace of change, but is it right to make anything more than two years old obsolete?

But does Google have a point about HTML5? Well, the “web developers” whose blog comments they’ve allowed through uncensored seem to think so. But web developers are often just lusers with pretensions, fresh out of a lightweight college and dazzled by the latest cool gimmick. Let’s assume Google is a bit more savvy than that. So what’s their game? Advertising. Never forget it. Newer web technologies are driven by a desire to push adverts – Flash animations and HTML5 – everything. Standard HTML is fine for publishing standard information.

I’ll take a lot of convincing that Google’s decision isn’t to do with generating more advertising revenue at the expense of the less well-off Internet users across the globe. Corporate evil? It looks like it from here.

WPAD and Windows 7 and Internet Explorer 8

I’ve recently set up WPAD automatic proxy detection at a site – very useful if you’re using a proxy server for web access (squid in this case). However, some of the Windows 7 machines failed to work with it (actually, my laptop which is just about the only Windows 7 machine here). This is what I discovered:

It turns out that those smart guys at Microsoft have implemented a feature to stop checking for a WPAD server after a few failed attempts. It reckons it knows which network a roaming machine is on, and leaves a note for itself in the registry if it’s not going to bother looking for a proxy server on that network again. A fat lot of use if you’ve only just implemented it.

If it fails to find a proxy, but manages to get to the outside world without one it will set the following key:


HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Wpad\
WpadDecision = 0

If you want it to try again (up to three times, presumably), you can simply delete this key. You can disable the whole crazy notion by adding a new DWORD registry key:


HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Wpad\WpadOverride = 1

You may well want to do this if you’re using a VPN or similar, as I really don’t think Windows 7 has any completely reliable method of determining the network it’s connected to. I’m impressed that it manages to ever get it right, but I’m sure it’s easy enough to fool it. Does anyone know how it works?

Infosec Europe 2011 – worrying trend

Every Infosec (the Information Security show in London) seems to have a theme. It’s not planned, it just happens. Last year it was encrypted USB sticks; in 2009 it was firewalls. 2011 was the year of standards.

As usual there were plenty of security related companies touting for business. Most of them claimed to do everything from penetration testing to anti-virus. But the trend seemed to be related to security standards instead of the usual technological silver bullets. Some of the companies were touting their own standards, others offering courses so you could get a piece of paper to comply with a standard, and yet others provided people (with aforementioned paper) to tick boxes for you to prove that you met the standard.

This is bad news. Security has nothing to do with standards; proving security has nothing to do with ticking boxes. Security is moving towards an industry reminiscent of Total Quality Assurance in the 1990s.

One thing I heard a lot was “There is a shortage of 20,000 people in IT security” and the response appears to be to dumb-down enough such that you can put someone on a training course to qualify them as a box-ticker. The people hiring “professionals” such as this won’t care – they’ll have a set of ticked boxes and a certificate that proves that any security breach was “not their fault” as they met the relevant standard.

Let’s hope the industry returns to actual security in 2012 – I might even find merit in the technological fixes.

Google Phishing Tackle

In the old days you really needed to be a bit technology-savvy to implement a good phishing scam. You needed a way of sending out emails, a web site for them to link back to that wouldn’t be blacklisted and couldn’t be traced, plus the ability to create an HTML form to capture and record the results.

[Image: bank phishing scam form created using Google Apps. Caption: Creating a phishing scam form with Google Apps is so easy]

These inconvenient barriers to entry have been swept away by Google Apps.

A few days back I received a phishing scam email pointing to a form hosted by Google. Within a couple of minutes of its arrival an abuse report was filed with the Google Apps team. You might expect them to deal with such matters, but this still hadn’t been actioned two days later.

If you want to have a go, the process is simple. Get a Gmail account, go to Google Docs and select “Create New…Form” from the menu on the left. You can set up a data capture form for anything you like in seconds, and call back later to see what people have entered.

Such a service is simply dangerous, and Google doesn’t appear to be taking this at all seriously. Given their “natural language technology” it shouldn’t be hard for them to spot anything looking like a phishing form, so I decided to see how easy it was and tried something blatant. This is the result:

No problem! Last time I checked the form was still there, although I haven’t asked strangers to fill it in.