Passwords

Protecting Your Password
Shoulder Surfing
Assume that as you are typing your password, someone peeks over your shoulder and notices that you typed 6 lowercase letters, but cannot tell which ones. How easy would it be for them to guess your password?
They can brute-force it by trying all possible combinations of 6 lowercase letters until they find the correct one. Since there are 26 possible lowercase letters, there are:
\[26^6 = 308,915,776 \approx 309 \textrm{ million}\]
different possible passwords made of 6 lowercase letters. This might sound safe. After all, any website is likely to lock them out if they provide an incorrect password a few times!
A PYTHON NOTE
Instead of using input() to read passwords in your Python programs, you can use the getpass module, which hides the password as you type it. Here is a simple example:
import getpass
password = getpass.getpass("Enter your password: ")
print("You entered:", password)
Brute-Force Attacks
Imagine if the password database of your website was leaked and the attacker had access to it (this could happen even with top websites!). Typically, websites store passwords encrypted. Hence, your password should still be unreadable by the attacker. However, the attacker now has an easy way to crack your password!
Knowing your encrypted password, they can encrypt each of the \(308,915,776\) possible passwords and check whether any result matches your encrypted password. Depending on how the passwords were encrypted, this can be done on a laptop in a matter of hours, and on a stronger computer with multiple GPUs in a matter of seconds or less!
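The attack described above can be sketched in a few lines of Python. This is only an illustration, not a real cracking tool: it assumes the leaked value is an unsalted SHA-256 hash (the text does not say which method the website uses), and it uses a 4-letter password so that the loop finishes quickly. The names hash_candidate and brute_force are made up for this example.

```python
import hashlib
import itertools
import string


def hash_candidate(candidate):
    # Assumption: the site stored an unsalted SHA-256 hash of the password.
    return hashlib.sha256(candidate.encode()).hexdigest()


def brute_force(target_hash, length):
    """Try every lowercase string of the given length until one matches."""
    for letters in itertools.product(string.ascii_lowercase, repeat=length):
        candidate = "".join(letters)
        if hash_candidate(candidate) == target_hash:
            return candidate
    return None  # no lowercase string of this length matches


# Pretend this hash came from a leaked database.
leaked_hash = hash_candidate("zebu")
print(brute_force(leaked_hash, 4))  # prints: zebu
```

With 6 letters, the same loop would run up to 308,915,776 times. That is slow in pure Python, but dedicated cracking software running on GPUs tests candidates far faster.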
Protecting Your Password
Most websites currently require passwords to be at least 8 characters long, and to contain a combination of uppercase letters, lowercase letters, numbers, and special characters. How secure does this make the password?
Do The Math
If the password has 8 lowercase characters, the number of possible passwords is:
\[26^8 = 208,827,064,576 \approx 209 \textrm{ billion}\]
If the password can also contain uppercase letters, the number is:
\[(26 + 26)^8 = 53,459,728,531,456 \approx 53 \textrm{ trillion}\]
If it can also contain digits:
\[(26 + 26 + 10)^8 = 218,340,105,584,896 \approx 218 \textrm{ trillion}\]
If each character can also be one of 32 special characters:
\[(26 + 26 + 10 + 32)^8 = 6,095,689,385,410,816 \approx 6 \textrm{ quadrillion}\]
Adding such requirements makes brute-force attacks far more expensive: the attacker faces roughly 20 million times as many candidates as with 6 lowercase letters.
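You can reproduce these counts yourself. The short sketch below recomputes the number of possible passwords for each alphabet size; the function name search_space and the labels are made up for this example.

```python
def search_space(alphabet_size, length):
    # Each of the `length` positions can hold any of `alphabet_size` characters.
    return alphabet_size ** length


# Alphabet sizes used in the calculations above.
alphabets = {
    "lowercase": 26,
    "+ uppercase": 26 + 26,
    "+ digits": 26 + 26 + 10,
    "+ 32 special characters": 26 + 26 + 10 + 32,
}

for name, size in alphabets.items():
    print(f"{name:>24}: {search_space(size, 8):,}")
```

Try changing the length from 8 to 12: each extra character multiplies the count by the alphabet size, so length usually helps more than adding one more character class.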
🔗 LINK
The following website provides estimates for how long it takes to crack your password! https://bitwarden.com/password-strength/
While the above requirements make passwords more secure, they also make them harder to remember. People tend to use the same password across multiple websites, or use small variations of the same password (e.g., Password1!, Password_1234, etc.). This makes it easier for attackers to guess passwords using more sophisticated methods than brute-force attacks.
Dictionary Attacks
Passwords that are easy to remember often follow common patterns or use common words. Therefore, attackers try to exploit this by using a predefined list of common passwords (a dictionary). Instead of trying all possible combinations of characters, they try to encrypt each password from the dictionary and see if it matches the encrypted password they are trying to crack. This can be much faster than brute-force attacks, and surprisingly effective!
📖 Definition
A Dictionary Attack is an attempt to guess the password based on a predefined list of strings. The list can contain English words (hence the name dictionary), common strings found in passwords (e.g., password, 1234), or actual passwords collected from previous breaches. The attacker typically uses software to create variations of the strings in the dictionary (e.g., appending a number or a special character).
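Here is a minimal sketch of a dictionary attack with simple variations. It assumes the target is an unsalted SHA-256 hash, and the word list and the set of variations are made up for illustration; real tools use far larger lists and many more mutation rules.

```python
import hashlib


def sha256_hex(text):
    # Assumption: passwords were stored as unsalted SHA-256 hashes.
    return hashlib.sha256(text.encode()).hexdigest()


def variations(word):
    """Yield the word itself plus a few simple mutations of it."""
    yield word
    yield word.capitalize()
    for suffix in ("1", "123", "1234", "!", "1!"):
        yield word + suffix
        yield word.capitalize() + suffix


def dictionary_attack(target_hash, dictionary):
    # Hash every variation of every word and compare it to the target.
    for word in dictionary:
        for candidate in variations(word):
            if sha256_hex(candidate) == target_hash:
                return candidate
    return None


dictionary = ["password", "qwerty", "letmein", "dragon"]  # made-up sample list
leaked_hash = sha256_hex("Password1!")
print(dictionary_attack(leaked_hash, dictionary))  # prints: Password1!
```

Note that Password1! satisfies the usual complexity requirements, yet it falls after only a handful of guesses.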
📖 Definition
A Rainbow Table is a dictionary in which the strings (e.g., the common passwords) are not stored in plaintext, but are stored after being encrypted using one of the encryption methods commonly used for passwords. This can speed up the dictionary attack, as the attacker only has to search for the encrypted string, instead of encrypting each candidate before comparing. (Strictly speaking, real rainbow tables store this precomputed data in a compressed, chain-based form, but the idea is the same.)
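The precomputation idea can be sketched with a plain Python dict that maps each precomputed hash back to its password. This is a simple lookup table in the sense of the definition above, assuming unsalted SHA-256 hashes; the password list and the names are made up for illustration.

```python
import hashlib


def sha256_hex(text):
    return hashlib.sha256(text.encode()).hexdigest()


def build_table(passwords):
    """Precompute hash -> password once; reuse it for every leaked hash."""
    return {sha256_hex(p): p for p in passwords}


common_passwords = ["123456", "password", "qwerty", "letmein"]
table = build_table(common_passwords)

# Cracking now needs one dictionary lookup per leaked hash, with no hashing.
leaked_hashes = [sha256_hex("qwerty"), sha256_hex("correct horse battery")]
for h in leaked_hashes:
    print(h[:12], "->", table.get(h, "not in table"))
```

The cost of hashing is paid once when the table is built, and the same table can then be reused against every leaked database that uses the same method.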
🔗 OPTIONAL LINK
Watch a live demo of dictionary attacks: Password Cracking on Computerphile
Storing Passwords
No sane application stores its users’ passwords in plaintext. Not even the application’s administrators should be able to read such sensitive information, let alone hackers who gain unauthorized access to the files storing these passwords. At a minimum, passwords should be stored in some encrypted format.
How can the application authenticate users if it doesn’t know their passwords?
The typical procedure is as follows:
- When a user creates an account or changes their password, the application encrypts the password and stores it.
- When the user tries to log in, the application encrypts the password they provide and compares it to the stored encrypted version. If they match, the user is authenticated.
Hashing Passwords
For the above procedure to be secure, passwords must be encrypted using a one-way method that does not allow recovering the original password from its encrypted version. For example, the application should not use an encryption method like one-time pads, because given the encrypted password and the key, it is possible to recover the original password.
For storing passwords, cryptographic hashing is typically used. A cryptographic hash function takes an input (in our case, the password) and produces a fixed-size string of characters, which appears random. There is no key involved in this process and it is computationally infeasible to reverse the process (i.e., to recover the original password from its hashed version).
The following code snippet shows how to use the hashlib library to generate a hashed version of a password. This piece of code uses a hash function named SHA256.
import hashlib

def hash_password(password):
    # Return the SHA-256 hash of the password as a hexadecimal string.
    return hashlib.sha256(password.encode()).hexdigest()

print(' mypassword: ', hash_password("mypassword"))
print(' mypassword: ', hash_password("mypassword"))   # same output as above
print('mypassword1: ', hash_password("mypassword1"))  # different output
print('          1: ', hash_password("1"))            # different output
print('          2: ', hash_password("2"))            # different output
Note that the same password always produces the same hash, while different passwords produce, for all practical purposes, different hashes. Note also that even a small change in the password (e.g., adding a 1 at the end) produces a completely different hash.
Is This Enough?
Imagine an attacker who is collecting usernames and hashed passwords from many different leaked databases. Now, assume that they notice the same hashed password appearing multiple times, associated with different usernames. What can they conclude from this?
If the attacker successfully cracks one of the passwords (e.g., using a dictionary attack), they can immediately know the passwords of all other users who have the same hashed password! Therefore, it is important to make sure that even if two users have the same password, their hashed passwords look different. This is typically done using a technique called salting.
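To see why repeated hashes are a problem, here is a small sketch of what the attacker can do with leaked records. The usernames and passwords are made up, and the records are assumed to hold unsalted SHA-256 hashes.

```python
import hashlib
from collections import defaultdict


def sha256_hex(text):
    return hashlib.sha256(text.encode()).hexdigest()


# Made-up leaked records: (username, unsalted password hash).
leaked = [
    ("alice", sha256_hex("sunshine")),
    ("bob", sha256_hex("tr0ub4dor&3")),
    ("carol", sha256_hex("sunshine")),
]


def group_by_hash(records):
    """Collect the usernames that share each hash value."""
    groups = defaultdict(list)
    for username, password_hash in records:
        groups[password_hash].append(username)
    return groups


for password_hash, users in group_by_hash(leaked).items():
    if len(users) > 1:
        print("same password:", users)  # prints: same password: ['alice', 'carol']
```

The attacker learns that alice and carol share a password without cracking anything, and cracking one hash reveals both passwords.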
Applications typically add a salt to every password before hashing it. A salt is an extra random string (different for different users) added to the password before hashing it and stored alongside the password (often in plaintext). This way, if two users use the same password, their hashes will look different, because their salts are different.
Here is the typical procedure for using salts:
- When a user creates an account, the application generates a random salt (a random sequence of characters).
- The application creates a new string made by appending the salt to the user’s password.
- The application hashes the resulting string and stores both the salt and the hashed string.
When the user tries to log in:
- The application retrieves the stored salt for that user.
- Creates a string by appending the salt to the password provided by the user.
- Hashes the resulting string, and compares it to the stored hashed string for the user.
This way, even if two users have the same password, their hashed passwords will look different because of the different salts.
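The two procedures above can be sketched as follows. This is a minimal illustration under a few assumptions: an in-memory dict named users stands in for the real database, the salt is generated with Python's secrets module, and SHA-256 is used only to stay consistent with the earlier snippet. The function names are made up for this example.

```python
import hashlib
import hmac
import secrets

users = {}  # username -> (salt, hash); stands in for a real database


def hash_with_salt(password, salt):
    # Append the salt to the password, then hash the combined string.
    return hashlib.sha256((password + salt).encode()).hexdigest()


def register(username, password):
    salt = secrets.token_hex(16)  # 16 random bytes, written as 32 hex characters
    users[username] = (salt, hash_with_salt(password, salt))


def login(username, password):
    if username not in users:
        return False
    salt, stored_hash = users[username]
    # compare_digest avoids leaking information through comparison time.
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


register("alice", "sunshine")
register("carol", "sunshine")
print(login("alice", "sunshine"))  # True
print(login("alice", "wrong"))     # False
print(users["alice"][1] == users["carol"][1])  # False: the salts differ
```

Because every user has a different salt, a precomputed table of hashes is of no use to the attacker: they would need a separate table for each salt.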
Making It Even More Secure
Some cryptographic hash functions (like SHA256) are very fast. While this might sound good, it is bad for security. A slow hash function makes the job of a hacker harder, because the time needed for a brute-force attack depends not only on how many passwords will be tested, but also on how long it takes to test a single password! Therefore, it is advisable to use password-hashing algorithms like bcrypt and scrypt, which are deliberately slow (i.e., more secure).
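As a sketch of what this looks like in practice, Python's hashlib module includes an scrypt function. The cost parameters below (n, r, p) are illustrative values chosen for this example, not a recommendation, and the helper names are made up; hashlib.scrypt is only available when Python is built with a recent enough OpenSSL.

```python
import hashlib
import hmac
import secrets


def slow_hash(password, salt):
    # n, r and p control how much time and memory each hash costs.
    # These values are illustrative only.
    return hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


def make_record(password):
    salt = secrets.token_bytes(16)  # a fresh random salt for each password
    return salt, slow_hash(password, salt)


def verify(password, salt, stored):
    return hmac.compare_digest(slow_hash(password, salt), stored)


salt, stored = make_record("mypassword")
print(verify("mypassword", salt, stored))   # True
print(verify("mypassword1", salt, stored))  # False
```

A single login check takes a fraction of a second, which a user will not notice, but an attacker who has to repeat it for billions of candidates is slowed down enormously.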
Summary
Here is a summary of best practices for storing passwords securely:
- Store passwords encrypted.
- Use a one-way hash function for encryption.
- Use a unique salt for each password before hashing.
- Use a slow hash function designed for password storage (e.g., bcrypt, scrypt).
🔗 OPTIONAL LINK
The following video summarizes nicely how passwords should be stored: Studying With Alex: Password Storage Tier List