Don’t worry. I’m not getting into cryptography in any detail, and I’m going to try very hard not to mention entropy at all. There is so much confusion about passwords already, thanks to Hollywood movies and IT professionals parroting technobabble. I’m going to explain this in English.
What’s wrong with passwords?
If you’ve seen a cracker breaking into a computer on a TV programme, you’ll be familiar with the setup. Faced with a “login:” prompt, and imminent discovery by the guards walking down the corridor, they frantically type a few desperate things and suddenly the screen changes to “Downloading data, 15 seconds remaining”.
This is, of course, complete fiction. But how do crackers really steal passwords? Let’s assume they can’t guess it, because you haven’t used your kid’s name, “password” or “letmein” (the most common genius ideas from the 2000s). Weak passwords are still a problem, as is leaving a default password on something after installation. But there are ways you can lose hard-to-guess passwords too.
The first method is obvious. If you type in your password with someone looking over your shoulder, it’s no longer secret. This may seem obvious, but it’s also what a keyboard logger Trojan does. This simple piece of malware intercepts everything you type on your keyboard, passwords and all.
Most malware you’re likely to be infected with includes a key logger, or may download one once it’s started. Why wouldn’t it? They’re also found on PCs in Internet cafes around the world. It’s amazing how many people lose control of the Hotmail accounts after accessing their email on holiday.
If your password is grabbed by a key logger, it’s complexity really doesn’t matter. The traditional defense is to ensure you use different passwords for each system and change your passwords frequently. Changing your gmail password before the criminals do is unlikely.
There is another solution – two factor authentication (2FA). When you get down to it, there are two ways to prove you are you. One is something you know (e.g. a password), and the other is something you have (e.g. a key). If it helps, think about the them as being a combination lock and a key lock. And a lock doing both is A Good Thing.
You may think that having a key is a perfectly good option, as the key is (effectively) unique. No one else has the key. But supposing you lost it? With 2FA, no one can use you key without also knowing the combination. And if your combination became known, it’s useless without the key.
Another good example is chip-and-pin bank cards.
Incidentally, you may here people going on about MFA (Multi-factor authentication). What the third or subsequent factors may be is hard say, but for marketing purposes “multi” sounds better than “two”. (Bio-metrics are often cited as a third factor, but it’s effectively using your body as a key. In other words it’s still something you have).
But I’ve digressed. I was supposed to be talking about the second way of having your password stolen, and it’s also pretty simple: An attacker gets access to a computer containing a list of passwords, including yours.
Although it has been known to happen, there should never actually be such a list. That’d be crazy. If you don’t have a list of user-IDs and corresponding passwords, no one can steal it. if you do, they probably will.
But how does a computer know if you’ve entered your password if it doesn’t know what the password is supposed to be? That’s the cleaver bit.
What you do is keep a list of users, together with their hashed passwords. A hash is a code derived from your password, but which isn’t your password. When you log in, the computer derives the hash code from whatever you’ve entered and compares it with the stored hash – if they match then you entered the right password.
So how is a hash derived? How about an example. In our system a password is going to be a number, for simplicity. And I’ll call this number ‘p’ (for password). The resulting hash I will call ‘h’. Our hashing function (number 1) is going to be:
h = p x 7
Applying this to various passwords gives:
|User (stored)||Password (not stored)||Hash (stored)|
Table showing passwords hashed using trivial method
So, if Alice comes along and types her password as “321”, the computer hashes it and gets 2247. It then compares this with the stored hash, and open sesame.
If the user list is stolen, the thief won’t know Alice’s password is 321. Unless, of course, they divide the hash value by seven. Hash method 1 is pretty rubbish, as you can work it backwards.
In you divided by seven then you wouldn’t be able to work backwards to Alice’s password if you only stored the integer part. Or the modulus. But unfortunately, one in seven passwords entered would also match. Unless you pick a suitably complex number – how about Pi, and ignore the integer part. If we do this, we end up with the following:
|User (stored)||Password (not stored)||Hash (stored)|
Table showing passwords hashed using the improved algorithm
This is a much better hash, as you can’t reverse the method and retrieve the password. You can’t take Harry’s hash of 9915 and calculate what his password was. But, unfortunately, you can still work it out. If our passwords are all three digit numbers, there are only 1000 possible choices, and a computer could try them all in turn until if found a match. And this is why password complexity matters. If there are enough possible combinations it could take an unrealistic amount of time to try them all.
The next question to ask is “How many combinations are there?”, What you need I said at the start I’d keep the maths very simple, so you may want to skip this bit.
If you have a single character password that has to be a letter a-z, there are 26 possible combinations. That should be obvious. If you have two letters, the possible combinations are 26×26=676. Three letters is 26x26x26 (or 26^3)=17576 choices, and so on. In other words, if you take the number of possible characters and raise it to the power of the length you’ll have the total number of possible passwords. The following table gives the possible combinations for different lengths of password and sets of symbols.
|length||a-z||a-z,0-9||a-z,A-Z,0-9|| a-z, A-Z, 0-9,|
Table of possible permutations based on password complexity and length
If you’re not familiar with the number format 2E+09, it simply means 2 followed by nine zeros. When we’re talking about big numbers, the number of digits is going to be more useful.
On the face of it, the last column, including all the punctuation characters, is considerably better than a simple choice from a-z. But look more closely and you’ll notice that adding a few more simple characters quickly brings the number of combinations up. For example, an eight-character really complex password has a similar number of permutations to a simple ten-character one. Or a nine-character password if you add 0-9 to a-z.
I don’t know about you, but I’d rather type simple characters rather than messing about with shift, capital letters and punctuation. This puts pay to Myth Number 1: using punctuation and suchlike is necessarily better. The extra keystrokes hitting the Shift key are greater than if you stuck to lower-case.
Actually, it’s a lot worse than that. Everyone knows that people capitalize the first letter, use a $ instead of S and stick a ! on the end – or something similar. If they’re forced to change the password regularly they add 01, 02, 03… and so on to the end, which means an attacker can try such likely variations first.
So the characteristics of a good password are, simply, something that’s complex enough that it would take an unrealistic amount of time to brute-force, AND which is easy to type. Forget easy to remember; it’s got to be random. Passwords containing words to bulk out the length are much easier to crack, as words can be checked for early on.
So how complex does a password need to be? Well that depends on how fast an attacker can cycle through all the possible combinations. Using a computer, does 1000 guesses a second sound reasonable? How about a million? In Your Dreams. The fastest password guesser I know of in private hands can test 400,000,000,000 every second. That’s 4E+11. If you used the full symbol set, at random, a six-character password would take less than a second. If you simply have a rule saying “must contain two out of digits, upper-case letters or symbols”, and people have just one of each to satisfy the requirement, it’ll be substantially faster.
Put another way, a fully secure Microsoft-standard random password with no mistakes will take about five hours, maximum. You can bet nation states and serious cyber-criminals are going to be faster still; I wouldn’t be surprised if it was minutes or even seconds.
So how long if I want to be safe?
So how long should your password be? Well I’d like one that can’t be cracked in 1000 years as a minimum. That’s 3E+10 seconds. The cracker runs at 4E+11 a second, so multiply them together and you get around 1E+22 combinations needed.
From the table above, 16 random a-z characters is enough, or 15 characters if you add 0-9. If you want to include punctuation and so on, and you really, really, don’t mind mixing them in at complete random, then 12 will be enough. But this is a minimum, and you’ll probably have to add a character every year.
The smart answer is to abandon passwords and use certificates instead.