In episode 1 of “Password Insecurity” we examined some general basics about password security – or insecurity. There are 3 basic types of attack against passwords. These are:
We further discussed that choosing a good password is not trivial. In particular, humans are very bad at choosing randomness. In this episode we will elaborate more randomness. This article gives a lot of numbers and also some technical terms and acronyms about hash functions. Don’t be scared, it’s not so important ;)
Randomness is the absence of predictability. In respect to passwords this can mean: “can I predict the password” if I look at a specific person?
No. Really? Or did he choose the name of his first child and the wedding anniversary in combination? Or does he simply pick the name of his girl-friend and the current year? And to make the password “strong” he appends an exclamation mark at the end and then he makes it even stronger and capitalizes the first character…
As have already been discussed in the previous episode (Password Insecurity - Part 1), all these password modifiers do not really make the password stronger simply because the attacker can predict that the user does just this.
It is a general misconception that a password is more random if it looks random. The passwords “password”, “Password”, “Password2020”, “Password!”, “!password2020”, “p4ssw0rd”, “p4ssw0rd2020”,… are all more or less equal random (or not random) simply because the attacker can predict that the user takes some base word and does some common modifications.
The base word is found in the word list and the modifications can easily be calculated by a computer program.
The randomness of a password depends on the underlying key space. The key space is the set of all theoretical possible passwords. This does not necessarily mean that it depends on the number of characters that a word is made of.
Now pick a random word, the first that leaps into your mind!
Which one is it? Of course, the computer cannot really guess which word it was but it can assume and try. The active vocabulary of a German-speaking average person is about 70000 words (but other languages are similar). The chance is high that you picked one of these words. And even if we add a long list of names the chance is still high because with all names together it is just a few hundred more. You don’t have billions of names in your head ;)
Lets’ try to make the password more complex by adding numbers. The fact is, that people do not add any random number but a date or year which is familiar to them. If it is a year, mathematically we think this is a number with 4 digits, i.e. something between 1000 and 9999 but that doesn’t reflect reality. It is more likely that a number between 1960 and 2020 is chosen which are just 60 different numbers. As a result the size of the key space is approx. 70000 x 60 (and not 70000 x 9000 as one might expect).
Even if we try to use a special character to make the password more complex. And this is because we humans do not insert a random special character at any random position within the password. We pick a special character that can easily be reached on the keyboard and put it at the beginning or the end of the word. So in theory if we insert one of 15 different special characters at any random position into the password which is assumed to be a word made of 10 characters, we would increase the key space by a multiple of 150. But because we use only the first or last position and use just a subset of all available special characters, let’s say 5, the key space increases just by a multiple of 10. To resist against GPU attacks, as briefly discussed in the previous episode, password complexity has to be increased by orders of magnitude and not just by 10, 20, or 100.
It can often be seen that absurd password policies are applied, such as “Pick at least 10 characters.”, and “Use two capital letters.”, and “Use a special character.”, and “The password must be different than the last 3 passwords.”, and “…”, and the user must change it every 3 months. This may make the whole situation even worse. Because of the complex and annoying situation people will choose an extra simple base word and then will somehow try to fit those other necessary characters around it to match the policy as simple as possible.
Furthermore, the use of mobile devices is state of the art today. Have you ever tried to enter a password compiled of various characters, numbers, and special characters on a touch display?
To increase password complexity, you have to truly increase the underlying key space.
Again, humans are really bad in choosing randomness. You need a computer to pick a random password. There are 2 options:
A list-of-words password could be e.g. “peter old hollis the” or “one gold list wave”. Both are lists of 4 words chosen randomly. There are several generators but these two sample word list passwords where generated by Wikipedia. The English Wikipedia currently contains about 6 million pages. If we take only the first word of every page title and remove duplicates, still 2 million (2 · 106) words remain. Of course there are many abbreviations, acronyms and other special words one would not really consider as being a word. So let’s further assume just 10% are real words. It is still 200000 (2 · 105). There is the Random Article feature in the menu on the left. Just choose 4 random articles and use the first word of each as a sequence of words. Mathematically this gives a key space of 2000005 which is 1.6 · 1021 under the assumption that the attacker even knows that you use a list of 4 words chosen by the Wikipedia1. We will see immediately what that means if the attacker does not have this knowledge.
Using Wikipedia is just an example. It is also valid to use e.g. a hardcopy of a dictionary. You may also use any other book. Randomly open 4 different pages and choose the first word on each page. Although the latter may yield a smaller key space – dependent on the author – because the number of different words in an average novel will presumably be much lower. Thus, it may be wise to use 5 or 6 words instead of 4.
The big advantage of this password type is that it is relatively easy to remember and it is easy to type.
The second solution would be to choose a random sequence of characters using a password generator. The final key space depends on the size of the alphabet. This is the number of different characters per digit, e.g. all lowercase characters, or all lowercase and uppercase characters, or numerals, and/or special characters additionally.
The size of the final key space is the result of the size of the alphabet to the power of the number of digits. E.g. if we choose a password with 12 random digits chosen from all lowercase letters and the numerals 0 – 9, the alphabet consists of 36 different characters. Therefore the size of the key space is 3612 = 4.7 · 1018.
How much is this? If we assume that we cover the surface of the earth with raisins, one by one, then three quarters of Earth’s surface would be covered. Sounds pretty much ;)
Under the previously discussed constraints the length of the password determines the size of the key space. Just again, the previous chapter should have made clear that the key element of the strength is “randomly chosen”!
Table 1 gives an estimation about the key spaces dependent on the size of the input alphabet. In computer science instead of giving the number of elements of an entity usually the bit length is used2 which is shown in the last column. It is just another unit of measurement for the same thing.
Alphabet | Size of alphabet | Number of words/characters | Size of the key space | Bit length of the key space (approx.) |
Words of a Language (e.g. Wikipedia) | 200000 | 4 | 1.6 · 1021 | 70 |
|
| 5 | 3.2 · 1026 | 88 |
Lowercase letters | 26 | 10 | 1.4 · 1014 | 47 |
|
| 16 | 4.4 · 1022 | 75 |
|
| 20 | 2.0 · 1028 | 94 |
Lowercase letters and numerals | 36 | 10 | 3.6 · 1015 | 52 |
|
| 12 | 4.7 · 1018 | 62 |
|
| 16 | 8.0 · 1024 | 83 |
|
| 20 | 1.3 · 1031 | 103 |
Lowercase, uppercase and numerals | 62 | 10 | 8.4 · 1017 | 60 |
|
| 12 | 3.2 · 1021 | 71 |
|
| 16 | 4.7 · 1028 | 95 |
|
| 20 | 7.0 · 1035 | 119 |
Lower-, uppercase, numerals, and special characters | 94 | 10 | 5.4 · 1019 | 66 |
|
| 12 | 4.8 · 1023 | 79 |
|
| 16 | 3.7 · 1031 | 105 |
|
| 20 | 2.9 · 1039 | 131 |
Table 1: Comparison of Password Complexity
It should further be mentioned that the list-of-words passwords is a subset of the random character password. More specifically that means that e.g. the password “peter old hollis the” from above is a list of 4 random words but also a list of 20 characters. If the attacker knows (or suspects) that it is a list of words he can assume a complexity of 70 bits (according to the table 1). Without that knowledge a random character password has to be assumed. Given the smallest usual alphabet being just lowercase letters produces a bit size of 94 bits which is much larger and thereby takes longer to trawl through.
A pretty important finding should become obvious if we look a little bit closer to the numbers in table 1. A random 10-digit password with lowercase, uppercase, numerals, and special characters creates a key space of 66 bits while a 12-digit password with lowercase, uppercase, and numerals but no special characters creates a key space of 71 bits which is more (hence “better”) than 66, an 11-digit password lower/upper/numerals has 65 bits. Thus, we get approx. the same complexity if we randomly choose an 11-digit password without special characters compared to a 10-digit password with special characters. This is noted because the use of special characters gives a lower gain of password complexity than generally is believed. Also the input of special characters may be more difficult specifically if we take soft-keyboards on mobile devices into account.
The size of a key space – and that a password is randomly chosen out of this key space – is vital since it determines the maximum time an attacker may need to find the password. This is because with the brute-force attack the attacker tries all possible passwords of a specific key space. We should be aware that after half of the maximum time he already recovered a password with the probability of 50%. And of course with the luck of winning a lottery jackpot he could find the password within seconds. This is how long Stephen Mills needed to open the safe with a lost combination: »No one could open a safe for 40 years. A tourist cracked it on his first try.«.
The maximum cracking time does not just depend on the key space. It also significantly depends on the type of password hash and of course the computing power of the attacker. The latter simply is a matter of capital investment.
Table 2 gives a comparison of various hash values and their strength in respect to brute-force-resistance. These numbers have been measure with the same NVidia RTX 2070 Super as has been used in the previous article. It is currently sold at Amazon starting at € 550,- . The newer model RTX 2080 has an estimated performance gain of approx. 10% and is sold starting at € 750,- .
| Brute-force | performance | 32 | 40 | 48 | 64 |
NTLM (MS Windows) | weak | 17 billion | < 1s | 64 s | 4,6 h | 34 y |
WPA-PBKDF2 (Wifi) | moderate | 500000 | 2,3 h | 25 d | 17 y | 1,1 mio |
Salted SHA512 (Linux) | strong | 43000 | 28 h | 296 h | 207 y | 13 mio |
Salted Blowfish crypt (OpenBSD) | very strong | 8000 | 149 h | 4,4 y | 1116 y | 73 mio |
Table 2: Password Cracking Duration Comparison
Computing power increases linearly. This means if two GPUs are used passwords are cracked twice as fast and 10 GPUs in parallel work ten times faster.
As table 2 shows there is a huge difference between the hash functions. This in turn means that there is no perfect password length, i.e. a password that may be suitable for the Wifi network may not necessarily be suitable for a Windows login.
How to read this table? Let’s assume we are looking for a secure Wifi password. According to table 2 a password chosen from a 48 bit key space needs 17 years to be cracked or 8.5 years to have it cracked with a probability of 50%. Let’s further assume one has 50 such GPUs then it is still 2 months. Is your infrastructure worth to wait such a long time? Are there no other attack vectors which could be exploited more easy?
No, so let’s consider 48 bits of randomness to be enough. Now choose a password scheme from table 1 with a bit length of at least 48 bits. This is e.g. 10 digits of lowercase and numerals which has a bit length of 52 bit.
Furthermore we should take into account that computing power is steadily increasing. We cannot make any general statement about how fast hashes may be cracked in the future but during the last decade the power increased by a factor between 2 and 20. This performance increase can very simple be thwarted by adding a single digit (or more of course) to the password.
Windows NTLM hashes are much weaker hence we would suggest to use at least 12 digits of randomly chosen lowercase, uppercase, and numerals.
Now we know more about randomness and how it is associated to password security. The basic conclusion is that it is vital to choose a true random password with a sufficient length. What a “sufficient length” is depends highly on the use-case, i.e. Windows passwords should be longer than other passwords may be because of the weak method used to store passwords.
Furthermore, assume that we choose a very secure random password and we use this same password on multiple platforms and services. In the field of security there is the paradigm to always be as paranoid as possible. So let’s further assume that somebody, however, was able to crack your super secure password (maybe through a different type of attack). He can then easily login to all these services where you re-used the same password.
Having different secure passwords is a perfect measure to prevent against this. And this further leads to password managers as well as multi-factor authentication which was already mentioned at the beginning. We will explore these topics in the next episodes of this article series.
Bernhard R. Fischer started his career as a network engineer responsible for building up a national IP network and later developing and maintaining the major Internet services for this network such as DNS and Email. After his studies of business economics and information systems on the University of Vienna he became a researcher and teaching fellow on the University of Applied Sciences of St. Pölten in the fields of computer networks, operating systems, and IT security. Since 2017 he is employed as a senior security consultant at Antares-NetlogiX GmbH.
During his whole career Bernhard contributed to a lot of public projects and discussions, attended conferences as a speaker, and voluntarily supported numerous open source projects not just as an advocate but as software engineer and developer. He always placed emphasis on robustness and quality of software and systems. He painstakingly investigates the details of everything and he willingly passes his knowledge to everybody who is interested.
Some of his own public projects can be found on Github, some articles and opinions on his personal Tech & Society Blog, and on Twitter. Bernhard is also into sailboat sailing and runs his own professional web site and produces the technical podcast Schiff – Captain – Mannschaft which is about sailing, seamanship and sailboats.
[1] Wikipedia does not really use a cryptographically secure pseudo random number generator but we can expect that people all over the world clicking on Random Article make it a random function.
[2] Technically this is the binary logarithm of the size of the key space.