Why passphrases are a bad idea

Following my discussion of password lengths, it appears that NIST are concerned about naughty people brute-forcing hashes in stolen password lists. In order to make it more difficult, they’re recommending long passwords, like me. But unlike me, they’re suggesting that they should be made up of randomly chosen words to make them easier to remember because the users could connect them in their mind.

NCSC have their own version, which is basically the same. Originally it was suggested four words would be enough, but it’s often reported as just three.

Leaving aside the chances of someone actually picking random words that are genuinely random, I was curious as to how difficult this would be to crack.

Instead of having the 26 letters of the alphabet to play with, using words gives you a lot more symbols. I wondered how many, so I looked in the obvious place – the BSD spell-check dictionary. That’s a lot of words, but realistically you’d want them to be between three and nine characters long. If you make them too long it’ll take more time to type than people will be willing to spend. Also, looking at longer words, they tend to be made up of smaller ones, or common letter combinations (“anti”, “un”, “ly”, “able”).

So, with a bit of awk, I extracted all the words between three and nine characters. There are a LOT. But on closer inspection, they’re not words I’d have ever picked at random, mainly because I’ve never heard of them. They’re the kind of combinations that cause arguments among Scrabble players.
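The extraction and sampling steps can be sketched in a few lines. This is not the awk I used, just a rough Python equivalent; the function names are made up for illustration, and folding everything to lower case and dropping anything with an apostrophe are assumptions on my part.

```python
import random


def words_in_range(words, shortest=3, longest=9):
    """Return the unique, purely alphabetic words within the length bounds."""
    kept = set()
    for word in words:
        word = word.strip()
        if word.isalpha() and shortest <= len(word) <= longest:
            kept.add(word.lower())  # assumption: "Cat" and "cat" count once
    return sorted(kept)


def sample_for_review(words, size=200, seed=None):
    """Pick a random handful of words to go through by hand."""
    words = list(words)
    return random.Random(seed).sample(words, min(size, len(words)))


def estimate_known(total_words, known_in_sample, sample_size):
    """Scale the fraction recognised in the sample up to the whole list."""
    return round(total_words * known_in_sample / sample_size)


# Tiny stand-in for a real dictionary file.
demo = words_in_range(["a", "cat", "Cat", "dog\n", "it's", "extraordinary"])
print(demo)
```

To try it on a real list, pass an open file to words_in_range; where the spell-check dictionary lives depends on your system.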

To get some idea how many realistic words there were I extracted 200 at random, and went through to pick the ones I knew. I let proper names count, and it turns out that I knew about 20% of them overall. Whether I’d pick many of them if asked to think of a word at random is another matter; and one worthy of study. However, best-case this extrapolates to about 18,000 unique words in the symbol set. So for a three-word combination the best you’re going to get is 18,000^3 or 6E+12. That’s as good as a 9-character a-z password. Which can be broken in 14 seconds maximum.

Sorry NCSC, but I don’t like these odds. Let’s go up to the four-word version: That has 1E+17 combinations, which at the same guessing rate would take about three days to crack. And that’s being very optimistic about the choice of words being random.
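If you want to check the arithmetic, here is a minimal sketch. It assumes the 18,000-word vocabulary estimated above and a cracker testing 4E+11 guesses a second, which is the figure from my password-length post; both constants are assumptions you can change.

```python
GUESSES_PER_SECOND = 4e11  # assumed cracking rate, from the password-length post
VOCABULARY = 18_000        # best-case estimate of usable words


def combinations(symbols, length):
    """Size of the search space: symbols to the power of length."""
    return symbols ** length


def worst_case_seconds(symbols, length, rate=GUESSES_PER_SECOND):
    """Time to try every combination at the given guessing rate."""
    return combinations(symbols, length) / rate


for n_words in (3, 4):
    total = combinations(VOCABULARY, n_words)
    secs = worst_case_seconds(VOCABULARY, n_words)
    print(f"{n_words} words: {total:.0e} combinations, "
          f"{secs:,.0f} seconds ({secs / 86_400:.2f} days)")

# For comparison, nine random a-z characters.
print(f"9 letters: {combinations(26, 9):.0e} combinations")
```

Treating each word as one symbol is the whole point: a passphrase is a very short password written in a very large alphabet.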

In short, a passphrase instead of a password only makes sense if it’s pretty darn long. If you’re going for entropy, a random character password – if you can remember it – will be much, much quicker to type.

How long should my password be?

Don’t worry. I’m not getting into cryptography in any detail, and I’m going to try very hard not to mention entropy at all. There is so much confusion about passwords already, thanks to Hollywood movies and IT professionals parroting technobabble. I’m going to explain this in English.

What’s wrong with passwords?

If you’ve seen a cracker breaking into a computer on a TV programme, you’ll be familiar with the setup. Faced with a “login:” prompt, and imminent discovery by the guards walking down the corridor, they frantically type a few desperate things and suddenly the screen changes to “Downloading data, 15 seconds remaining”.

This is, of course, complete fiction. But how do crackers really steal passwords? Let’s assume they can’t guess it, because you haven’t used your kid’s name, “password” or “letmein” (the most common genius ideas from the 2000s). Weak passwords are still a problem, as is leaving a default password on something after installation. But there are ways you can lose hard-to-guess passwords too.

Password “sniffing”

The first method is obvious. If you type in your password with someone looking over your shoulder, it’s no longer secret. This may seem obvious, but it’s also what a keyboard logger Trojan does. This simple piece of malware intercepts everything you type on your keyboard, passwords and all.

Most malware you’re likely to be infected with includes a key logger, or may download one once it’s started. Why wouldn’t it? They’re also found on PCs in Internet cafes around the world. It’s amazing how many people lose control of their Hotmail accounts after accessing their email on holiday.

If your password is grabbed by a key logger, its complexity really doesn’t matter. The traditional defense is to ensure you use different passwords for each system and change your passwords frequently. Changing your Gmail password before the criminals do is unlikely.

There is another solution – two factor authentication (2FA). When you get down to it, there are two ways to prove you are you. One is something you know (e.g. a password), and the other is something you have (e.g. a key). If it helps, think about them as being a combination lock and a key lock. And a lock doing both is A Good Thing.

You may think that having a key is a perfectly good option, as the key is (effectively) unique. No one else has the key. But supposing you lost it? With 2FA, no one can use your key without also knowing the combination. And if your combination became known, it’s useless without the key.

Another good example is chip-and-pin bank cards.

Incidentally, you may hear people going on about MFA (Multi-factor authentication). What the third or subsequent factors may be is hard to say, but for marketing purposes “multi” sounds better than “two”. (Bio-metrics are often cited as a third factor, but it’s effectively using your body as a key. In other words it’s still something you have).

Wholesale pilfering

But I’ve digressed. I was supposed to be talking about the second way of having your password stolen, and it’s also pretty simple: An attacker gets access to a computer containing a list of passwords, including yours.

Although it has been known to happen, there should never actually be such a list. That’d be crazy. If you don’t have a list of user-IDs and corresponding passwords, no one can steal it. If you do, they probably will.

But how does a computer know if you’ve entered your password if it doesn’t know what the password is supposed to be? That’s the clever bit.

What you do is keep a list of users, together with their hashed passwords. A hash is a code derived from your password, but which isn’t your password. When you log in, the computer derives the hash code from whatever you’ve entered and compares it with the stored hash – if they match then you entered the right password.

So how is a hash derived? How about an example. In our system a password is going to be a number, for simplicity. And I’ll call this number ‘p’ (for password). The resulting hash I will call ‘h’. Our hashing function (number 1) is going to be:

h = p x 7

Applying this to various passwords gives:

User (stored)  Password (not stored)  Hash (stored)
Tom            123                    0861
Dick           200                    1400
Alice          321                    2247
Jane           567                    3969
Table showing passwords hashed using trivial method

So, if Alice comes along and types her password as “321”, the computer hashes it and gets 2247. It then compares this with the stored hash, and open sesame.


If the user list is stolen, the thief won’t know Alice’s password is 321. Unless, of course, they divide the hash value by seven. Hash method 1 is pretty rubbish, as you can work it backwards.
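Here is hash method 1 as a short sketch, including the login check and the reason it is rubbish. The function names and the stored dictionary are only illustrative.

```python
def hash1(p):
    """Hash method 1 from the text: multiply the numeric password by seven."""
    return p * 7


# Only user names and hashes are stored, never the passwords themselves.
stored = {
    "Tom": hash1(123),
    "Dick": hash1(200),
    "Alice": hash1(321),
    "Jane": hash1(567),
}


def login(user, attempt):
    """Hash whatever was typed and compare it with the stored hash."""
    return user in stored and hash1(attempt) == stored[user]


def reverse_hash1(h):
    """The weakness: dividing by seven hands the password straight back."""
    return h // 7


print(login("Alice", 321))             # correct password
print(login("Alice", 322))             # wrong password
print(reverse_hash1(stored["Alice"]))  # what a thief does with the stolen list
```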

If you divided by seven instead, you wouldn’t be able to work backwards to Alice’s password, provided you only stored the integer part. Or the remainder. But unfortunately, other passwords would then match as well: store the remainder and one in seven passwords entered would be accepted. Unless you divide by a suitably awkward number – how about Pi – and ignore the integer part, keeping the first four digits after the decimal point. If we do this, we end up with the following:

User (stored)  Password (not stored)  Hash (stored)
Tom            123                    1521
Dick           200                    6619
Alice          321                    1774
Jane           567                    4817
Harry          ???                    9915
Table showing passwords hashed using the improved algorithm

This is a much better hash, as you can’t reverse the method and retrieve the password. You can’t take Harry’s hash of 9915 and calculate what his password was. But, unfortunately, you can still work it out. If our passwords are all three-digit numbers, there are only 1000 possible choices, and a computer could try them all in turn until it found a match. And this is why password complexity matters. If there are enough possible combinations it could take an unrealistic amount of time to try them all.
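The improved hash and the try-them-all attack look something like this. It is a sketch: I am assuming the hash is the first four digits after the decimal point of the password divided by Pi, which is what the table values work out to, and the function names are invented.

```python
import math


def hash2(p):
    """Improved toy hash: first four decimal digits of p divided by Pi."""
    fraction = (p / math.pi) % 1.0  # throw away the integer part
    return int(fraction * 10_000)


def brute_force(target_hash, limit=1000):
    """Try every password below the limit and keep the ones that match."""
    return [p for p in range(limit) if hash2(p) == target_hash]


for name, password in [("Tom", 123), ("Dick", 200), ("Alice", 321), ("Jane", 567)]:
    print(name, f"{hash2(password):04d}")

# Harry's password is unknown, but his stolen hash is enough to go on.
print("Candidates for Harry:", brute_force(9915))
```

A search like this can turn up more than one candidate. That doesn’t help the defender: any password that produces the stored hash will get the attacker in.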

The next question to ask is “How many combinations are there?” I said at the start I’d keep the maths very simple, so you may want to skip this bit.

If you have a single character password that has to be a letter a-z, there are 26 possible combinations. That should be obvious. If you have two letters, the possible combinations are 26×26=676. Three letters is 26×26×26 (or 26^3)=17576 choices, and so on. In other words, if you take the number of possible characters and raise it to the power of the length you’ll have the total number of possible passwords. The following table gives the possible combinations for different lengths of password and sets of symbols.

length  a-z      a-z,0-9  a-z,A-Z  a-z,A-Z,0-9,symbols
1       26       36       52       96
2       676      1296     2704     9216
3       17576    46656    140608   884736
4       456976   1679616  7311616  84934656
5       1E+07    6E+07    4E+08    8E+09
6       3E+08    2E+09    2E+10    8E+11
7       8E+09    8E+10    1E+12    8E+13
8       2E+11    3E+12    5E+13    7E+15
9       5E+12    1E+14    3E+15    7E+17
10      1E+14    4E+15    1E+17    7E+19
11      4E+15    1E+17    8E+18    6E+21
12      1E+17    5E+18    4E+20    6E+23
13      2E+18    2E+20    2E+22    6E+25
14      6E+19    6E+21    1E+24    6E+27
15      2E+21    2E+23    5E+25    5E+29
16      4E+22    8E+24    3E+27    5E+31
symbols: ~!@#$%^&*_-+=`|(){}[]:;"'<>,.?/
Table of possible permutations based on password complexity and length
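If you’d rather not take the table on trust, it can be regenerated with a few lines. This is only a sketch; the four alphabet sizes (26, 36, 52 and 96) are read off the first row of the table, and the function names are my own.

```python
SYMBOL_COUNTS = (26, 36, 52, 96)  # alphabet sizes taken from the table's first row


def count_passwords(symbols, length):
    """Number of distinct passwords: symbols raised to the power of length."""
    return symbols ** length


def cell(symbols, length):
    """Exact value for short passwords, 1E+07-style notation for longer ones."""
    n = count_passwords(symbols, length)
    return str(n) if length <= 4 else f"{float(n):.0E}"


for length in range(1, 17):
    row = [cell(symbols, length) for symbols in SYMBOL_COUNTS]
    print(f"{length:<7}", *(f"{value:<9}" for value in row))
```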

If you’re not familiar with the number format 2E+09, it simply means 2 followed by nine zeros. When we’re talking about big numbers, the number of digits is going to be more useful.

On the face of it, the last column, including all the punctuation characters, is considerably better than a simple choice from a-z. But look more closely and you’ll notice that adding a few more simple characters quickly brings the number of combinations up. For example, an eight-character really complex password has a similar number of permutations to a simple eleven-character one. Or a ten-character password if you add 0-9 to a-z.

I don’t know about you, but I’d rather type simple characters than mess about with shift, capital letters and punctuation. This puts paid to Myth Number 1: using punctuation and suchlike is necessarily better. Hitting the Shift key costs more extra keystrokes than sticking to lower-case and adding a character or two.

Actually, it’s a lot worse than that. Everyone knows that people capitalize the first letter, use a $ instead of S and stick a ! on the end – or something similar. If they’re forced to change the password regularly they add 01, 02, 03… and so on to the end, which means an attacker can try such likely variations first.
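To show how little these tricks add, here is a sketch that lists the predictable variations of a single base word. The particular substitutions and suffixes are just the ones mentioned above; the function name is made up and the list is nowhere near exhaustive.

```python
def likely_variants(word):
    """Yield the predictable 'complexity' tweaks people make to one word."""
    bases = [
        word,
        word.capitalize(),                          # capital first letter
        word.replace("s", "$").replace("S", "$"),   # $ instead of S
    ]
    suffixes = ["", "!"] + [f"{n:02d}" for n in range(1, 4)]  # !, 01, 02, 03
    seen = set()
    for base in bases:
        for suffix in suffixes:
            candidate = base + suffix
            if candidate not in seen:
                seen.add(candidate)
                yield candidate


variants = list(likely_variants("sesame"))
print(len(variants), "guesses cover the obvious tweaks:", variants)
```

A forced change from one numbered suffix to the next costs the attacker one extra guess, not a whole new search.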

So the characteristics of a good password are, simply, something that’s complex enough that it would take an unrealistic amount of time to brute-force, AND which is easy to type. Forget easy to remember; it’s got to be random. Passwords containing words to bulk out the length are much easier to crack, as words can be checked for early on.

So how complex does a password need to be? Well that depends on how fast an attacker can cycle through all the possible combinations. Using a computer, does 1000 guesses a second sound reasonable? How about a million? In Your Dreams. The fastest password guesser I know of in private hands can test 400,000,000,000 every second. That’s 4E+11. If you used the full symbol set, at random, a six-character password would take about two seconds at most. If you simply have a rule saying “must contain two out of digits, upper-case letters or symbols”, and people have just one of each to satisfy the requirement, it’ll be substantially faster.

Put another way, a fully secure Microsoft-standard random password with no mistakes will take about five hours, maximum. You can bet nation states and serious cyber-criminals are going to be faster still; I wouldn’t be surprised if it was minutes or even seconds.
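The sums behind those figures are simple enough to sketch. I am assuming the 4E+11 guesses a second quoted above, a 96-symbol full set as in the table, and that “Microsoft-standard” means eight characters; the function name is only for illustration.

```python
RATE = 4e11  # guesses per second for the cracker described above


def worst_case_seconds(symbols, length, rate=RATE):
    """Seconds needed to try every possible password of this length."""
    return symbols ** length / rate


six = worst_case_seconds(96, 6)
eight = worst_case_seconds(96, 8)
print(f"6 characters, full set: {six:.1f} seconds")
print(f"8 characters, full set: {eight / 3600:.1f} hours")
```

These are maximums. On average the attacker finds the password halfway through the search.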

So how long if I want to be safe?

So how long should your password be? Well I’d like one that can’t be cracked in 1000 years as a minimum. That’s 3E+10 seconds. The cracker runs at 4E+11 a second, so multiply them together and you get around 1E+22 combinations needed.

From the table above, 16 random a-z characters is enough, or 15 characters if you add 0-9. If you want to include punctuation and so on, and you really, really, don’t mind mixing them in at complete random, then 12 will be enough. But this is a minimum, and you’ll probably have to add a character every year.
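The same calculation can be turned around to find the shortest safe length for any alphabet. A minimal sketch, assuming the 1000-year target and the 4E+11 rate used above; change either and the answers move.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def minimum_length(symbols, years=1000, rate=4e11):
    """Shortest length whose full search outlasts the target time."""
    target = years * SECONDS_PER_YEAR * rate  # combinations needed
    length = 1
    while symbols ** length < target:
        length += 1
    return length


for label, symbols in [("a-z", 26), ("a-z,0-9", 36), ("full set", 96)]:
    print(f"{label}: at least {minimum_length(symbols)} random characters")
```

Every time the crackers get faster you can rerun it with the new rate, which is where the “add a character every year” comes from.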

The smart answer is to abandon passwords and use certificates instead.

eBay security problem in February – just noticed!

Well, it had to happen. Today eBay announced a serious security compromise. Apparently someone’s got hold of employee login details that allowed access to databases containing customer names and contact details, together with password hashes.

Should anyone be worried?

Well, a hashed password isn’t a password but it’s possible to crack, especially if it was a weak one (i.e. a word or two words conflated, with a digit on the end and possibly a full stop). eBay says that there’s no evidence of any fraudulent transactions. Yeah, great. The problem is going to come when people have used the same password elsewhere, like on their PayPal account, bank account or somewhere important – armed with their contact details and a crackable password, those people could be in real trouble.

eBay is due to email everyone very soon to ask them to change their password. It’s called shutting the stable door once the horse has bolted – this data may have been in the hands of the criminals for a couple of months now. You don’t need to change your eBay password; you need to change the password on every system that used it.

The sooner this antiquated means of verifying identity is replaced by secure public certificates, the better – but the punters won’t understand how those work.

So what does this mean? Your password was secure but now it isn’t? No. It was only secure before if you trusted the eBay employees. And a fine upstanding bunch they are.

Next, of course, the scammers are going to spam everyone with phishing eBay credential change emails. And when this hits the news, who’s going to disbelieve it? eBay really needed to manage the news dissemination better.