Following my discussion of password lengths, it appears that NIST are concerned about naughty people brute-forcing hashes in stolen password lists. To make this more difficult they’re recommending long passwords, as I do. Unlike me, though, they’re suggesting that passwords should be made up of randomly chosen words, which are easier to remember because users can connect them in their minds.
NCSC have their own version, which is basically the same. Originally it was suggested four words would be enough, but it’s often reported as just three.
Leaving aside the chances of someone actually picking words that are genuinely random, I was curious as to how difficult this would be to crack.
Instead of having the 26 letters of the alphabet to play with, using words gives you a lot more symbols. I wondered how many, so I looked in the obvious place – the BSD spell-check dictionary. That’s a lot of words, but realistically you’d want them to be between three and nine characters long. If you make them too long it’ll take more time to type than people will be willing to spend. Also, looking at longer words, they tend to be made up of smaller ones, or common letter combinations (“anti”, “un”, “ly”, “able”).
So, with a bit of awk, I extracted all the words between three and nine characters. There are a LOT. But on closer inspection, they’re not words I’d have ever picked at random, mainly because I’ve never heard of them. They’re the kind of combinations that cause arguments among Scrabble players.
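The extraction step can be approximated in a few lines of Python rather than awk. This is only a sketch: the function names are made up for illustration, and the path `/usr/share/dict/words` is an assumption about where the BSD word list lives on a given system.

```python
def filter_words(lines, min_len=3, max_len=9):
    """Return the unique entries whose length is between min_len and max_len."""
    seen = set()
    kept = []
    for line in lines:
        word = line.strip()
        if min_len <= len(word) <= max_len and word not in seen:
            seen.add(word)
            kept.append(word)
    return kept


def load_words(path="/usr/share/dict/words"):
    # Assumed location of the word list; adjust for your system.
    with open(path, encoding="utf-8") as handle:
        return filter_words(handle)
```

Something like `random.sample(load_words(), 200)` would then pull out a random batch of 200 words to look through by hand.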
To get some idea how many realistic words there were I extracted 200 at random, and went through to pick the ones I knew. I let proper names count, and it turns out that I knew about 20% of them overall. Whether I’d pick many of them if asked to think of a word at random is another matter, and one worthy of study. However, best-case this extrapolates to about 18,000 unique words in the symbol set. So for a three-word combination the best you’re going to get is 18,000^3, or about 6E+12. That’s about as good as a nine-character a-z password, and a nine-character password of that kind can be broken in 14 seconds at most.
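The comparison is easy to check. The sketch below takes the 18,000-word estimate at face value and compares the three-word keyspace with that of nine lowercase letters; the names are illustrative only.

```python
VOCABULARY = 18_000  # best-case estimate of recognisable words
LOWERCASE = 26       # a-z


def keyspace(symbols, length):
    """Number of distinct sequences of `length` symbols."""
    return symbols ** length


three_words = keyspace(VOCABULARY, 3)   # 5,832,000,000,000
nine_letters = keyspace(LOWERCASE, 9)   # 5,429,503,678,976

print(f"three words : {three_words:.2e}")
print(f"nine letters: {nine_letters:.2e}")
print(f"ratio       : {three_words / nine_letters:.2f}")
```

The two keyspaces are within about 10% of each other, which is why the three-word passphrase is no harder to brute-force than the nine-letter password.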
Sorry NCSC, but I don’t like these odds. Let’s go up to the four-word version: that has 1E+17 combinations, which at the same guess rate would take about three days to crack. And this is still being very optimistic about the choice of words being random.
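How long the four-word version holds out depends entirely on the guess rate. As a rough sketch, the code below assumes the rate implied by the 14-second figure for nine lowercase letters and scales it linearly; a slower hash or weaker hardware would stretch the result considerably, so treat the rate as a placeholder.

```python
SECONDS_PER_DAY = 86_400

# Assumption: the guess rate implied by exhausting 26^9 candidates in 14 seconds.
ASSUMED_RATE = 26 ** 9 / 14


def worst_case_days(keyspace, guesses_per_second=ASSUMED_RATE):
    """Days needed to try every candidate at a fixed guess rate."""
    return keyspace / guesses_per_second / SECONDS_PER_DAY


four_words = 18_000 ** 4  # roughly 1.05E+17
days = worst_case_days(four_words)
print(f"four words: about {days:.1f} days at {ASSUMED_RATE:.2e} guesses/s")
```

At that assumed rate the worst case comes out at a little over three days, and on average an attacker would need half that.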
The rationale behind using a passphrase is that it’s easier for users to remember, and this is a good point. People can create a mind map: a short story or scene using the four chosen words. It is certainly easier than remembering a sequence of random symbols. You can also create a mind map using symbols by giving them meaning (1 = flagpole, 2 = swan, etc.). But this misses an important point – the sheer number of passwords people have in the modern world. I suggest most people would struggle to remember more than half a dozen mind maps, yet probably have well over 100 unique passwords. The only way to manage so many unique passwords is to store them somewhere, encrypted with one master password.
In short, a passphrase instead of a password only makes sense if it’s pretty darn long. If you’re going for entropy, a random character password – if you can remember it – will be much, much quicker to type.
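To put a number on the typing argument, the sketch below works out how many random characters are needed to match a four-word passphrase. It assumes the 18,000-word vocabulary estimated earlier; the 94-character alphabet (printable ASCII without the space) is my own assumption for a "random character" password.

```python
import math


def entropy_bits(symbols, length):
    """Entropy in bits of `length` symbols drawn uniformly at random."""
    return length * math.log2(symbols)


def chars_to_match(keyspace, alphabet_size):
    """Smallest password length whose keyspace is at least `keyspace`."""
    length = 0
    while alphabet_size ** length < keyspace:
        length += 1
    return length


four_words = 18_000 ** 4
passphrase_bits = entropy_bits(18_000, 4)         # roughly 56.5 bits
lowercase_needed = chars_to_match(four_words, 26)  # a-z only
printable_needed = chars_to_match(four_words, 94)  # assumed printable ASCII set

print(f"{passphrase_bits:.1f} bits, {lowercase_needed} lowercase or "
      f"{printable_needed} printable characters")
```

Four words of three to nine letters each mean somewhere between 12 and 36 keystrokes before any separators, against 13 random lowercase letters or 9 random printable characters for the same strength.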