PRISON OF MIRRORS We're not here because we're free. We're here because we're not free.

Monthly Archives: July 2012

An Introduction to Password Hashing and Storage

July 19, 2012

With all the recent password leaks and other security-related blunders, I decided to talk about password hashing and explain it so that people who aren’t computer savvy can understand. Hopefully people will gain a better understanding of how passwords are stored and how to create good ones.

Plain Text

You might have heard the term “plain text” when hearing about passwords or one of the recent leaks (or the mega-Sony leak of last year). This is sometimes referred to as clear text. So what does this mean? It simply means the password is stored so any person can read it. Like so:

Passw0rd

The dangers of this should be pretty obvious: anyone who has access to the password knows what it is. Considering that in a database a user name (or even worse, an email address) usually accompanies the password, the danger is even greater, especially if you’re one of those people who reuse passwords (hint: don’t do that). And if a database is leaked… game over. So how do we prevent someone who has access to the password from knowing what it is? This is where hashing comes in.

Hashing

Hashing is a way of obfuscating text. It can be any text, but it’s commonly used for passwords. It’s a one-way algorithm that transforms the password into a fixed-length string of random-looking letters and numbers (written in hexadecimal, meaning each “digit” has a value from 0 to 15, represented by 0 to 9 and a to f). Because the algorithm is one-way, there’s no practical way to reverse the hash. At the bare minimum, passwords should be stored as hashed values in the database, and the plain text versions should never be stored under any circumstances.

There are many different hashing algorithms. Some are better than others. The two most common ones are MD5 and SHA-1. Neither of these algorithms is considered secure anymore, but both are still widely used. Each hashing algorithm is one-way and always produces a fixed-length string of characters, known as a hash (or digest). MD5 outputs a 32-character string and SHA-1 outputs a 40-character string.

Let’s take a look at the MD5 output for Passw0rd, the example used above:

d41e98d1eafa6d6011d3a70f1a5b92f0

As one can see, there’s no real way to tell what password this is just by looking at it. It is important to note that each algorithm always produces the same output for the same input. In other words, every time one enters Passw0rd, the result will always be what’s above. This is how websites and other applications are able to authenticate users. If the user enters an incorrect password, the hashed output will be different from what’s in the database, and the user will not be authenticated.
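The check described above can be sketched in a few lines of Python using the standard hashlib module. This is only an illustration: the function names md5_hex and matches and the stored value are made up for the example, and MD5 is used only because it matches the example above.

```python
import hashlib
import hmac


def md5_hex(password: str) -> str:
    """Return the MD5 hash of the password as 32 hexadecimal characters."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def matches(stored_hash: str, attempt: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    # compare_digest takes the same time whether or not the values match,
    # so the comparison itself does not leak information through timing.
    return hmac.compare_digest(stored_hash, md5_hex(attempt))


# Pretend this value was saved in the database when the account was created.
stored = md5_hex("Passw0rd")

print(len(stored))                   # 32 hex characters for MD5
print(matches(stored, "Passw0rd"))   # True: same input, same hash
print(matches(stored, "passw0rd"))   # False: different input, different hash
```

Note that the application never needs the original password again after the account is created; it only ever compares hashes.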

Another important thing to note is that input that is sequential or closely related will result in a drastically different output. Here’s a quick example using SHA-1:

Passw0rd is ebfc7910077770c8340f63cd2dca2ac1f120444f in SHA-1
passw0rd is 7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53 in SHA-1

As you can see, changing just the first letter from uppercase to lowercase has a drastic effect on the output, and the two do not look similar at all. This helps make the algorithms stronger.
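To see this effect for yourself, here is a small Python sketch that hashes both spellings with SHA-1 and counts how many character positions differ. The helper names sha1_hex and differing_positions are assumptions made for the example.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of the text as 40 hexadecimal characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


upper = sha1_hex("Passw0rd")
lower = sha1_hex("passw0rd")

print(upper)
print(lower)
print(differing_positions(upper, lower), "of", len(upper), "characters differ")
```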

Predictable Output

While the output isn’t predictable to humans, it certainly is to computers. In an effort to “crack” hashed passwords people have created “dictionaries” and what’s known as rainbow tables to cover as many hash possibilities as possible. These dictionaries can number in the billions, and you can bet that the common passwords people choose on a regular basis are in there. There are rainbow tables that cover all combinations of upper and lowercase letters, numbers, and symbols up to eight characters for the MD5 algorithm. This is why it’s so important not to use some simple password like J@son123.
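A precomputed dictionary can be sketched in Python as a simple lookup from hash to password. This is a plain lookup table with a five-entry word list chosen for illustration; a real rainbow table uses a more compact structure, and real dictionaries are vastly larger.

```python
import hashlib
from typing import Optional


def md5_hex(text: str) -> str:
    """Return the MD5 hash of the text as hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# A tiny stand-in for a real dictionary, which could hold billions of entries.
common_passwords = ["123456", "password", "letmein", "Passw0rd", "qwerty"]

# Hash every candidate once, up front. After that, each lookup is instant.
lookup = {md5_hex(p): p for p in common_passwords}


def crack(leaked_hash: str) -> Optional[str]:
    """Return the matching password if its hash was precomputed, else None."""
    return lookup.get(leaked_hash)


print(crack(md5_hex("Passw0rd")))         # found, because it is in the list
print(crack(md5_hex("x9$Lq!72vTz#pW")))   # None, not in this small list
```

The attacker does the expensive hashing only once and can then reuse the table against every leaked database that uses the same algorithm.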

This suddenly doesn’t sound very secure. How do we mitigate the effects of precomputed hash dictionaries and rainbow tables? We use what’s known as a salt.

Salting

Salting is adding an additional string to the password and then hashing the combined string. Salts are usually added to the front or end of the password. There are a few caveats about salts, though. Each salt should be unique: never reuse one across users, and generate a new one every time a user changes their password. The salt should also be long and random, probably the same length as the hash algorithm’s output. The longer and more random, the stronger it is. I personally like to use a combination of PHP’s mt_rand() function (without arguments), which generates a random number between 0 and 2147483647, and the current date and time, formatted in a specific way, accurate to the second.

Adding a salt has two major strengths: 1) it makes the password much harder to guess, and a good salt would render hash dictionaries and rainbow tables useless, forcing an attacker to use brute-force techniques which could take centuries or longer, and 2) if two users are using the same password, the hashes that are stored in the database will look completely different. Since it’s expected that a lot of common passwords are reused, many unsalted hashes in a large database would be the same. An attacker would only have to crack one of them to reveal every account using that password. Giving each user a unique salt prevents this.
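Here is a minimal Python sketch of salting, under a few assumptions. It generates the salt with the standard secrets module rather than the mt_rand() and timestamp combination mentioned above, it places the salt in front of the password, and it uses SHA-1 only to stay consistent with the examples in this post. The function names new_salt and salted_sha1 are made up for the example.

```python
import hashlib
import secrets


def new_salt() -> str:
    """Return 40 random hex characters, the same length as a SHA-1 hash."""
    return secrets.token_hex(20)  # 20 random bytes become 40 hex characters


def salted_sha1(salt: str, password: str) -> str:
    """Put the salt in front of the password and hash the combined string."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


# Two users pick the same password, but each one gets their own salt.
salt_a, salt_b = new_salt(), new_salt()
hash_a = salted_sha1(salt_a, "Passw0rd")
hash_b = salted_sha1(salt_b, "Passw0rd")

print(salt_a, hash_a)
print(salt_b, hash_b)
print(hash_a == hash_b)  # False unless the two random salts happen to collide
```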

Simply put, adding a salt is just adding a string that’s so long, so ridiculously random, that it’s improbable to appear in any hash dictionary or rainbow table. Take a look at the following two passwords. Which do you think is more likely to appear in someone’s list?

Passw0rd
d32c00e3c7935e76da471babeda400c902903ee0Passw0rd

And what if three people in the database are all using the password Passw0rd?

The SHA-1 hash values of the following passwords

4e4940aa6f9df8148aafa6ed458d583091b6c162Passw0rd
79564deffc86af62e08f90d5cc432880357f5773Passw0rd
b5c4f7b8919242ed4316e302deb50748eb235812Passw0rd

Result in:

2a22aeaf4e0a6f10f475effa1256feff3b39b328
23da0509aaebb5b0e5911f9e1a5fb624f3bc6b0a
7e130f3cbb36cb86cd3caef24c554605c363012f

It’s not obvious that all three of those users are using the same password. Salting is essential to password security and every website should use it. Salts are usually stored in the database alongside the hash. While this may seem detrimental, keep in mind that for an attacker to break the hash, they would have to apply the salt to every single entry in their dictionary, which would take an enormous amount of time, and they would have to repeat that work for every password! Also don’t forget that the attacker has no way of knowing (unless they are able to get hold of the application’s source code, which has happened before) whether the salt is added to the front, the end, or combined in some other way before being hashed. Even if they did know, breaking even a single password would take an impractical amount of time.
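Storing the salt next to the hash and using it again at login can be sketched like this in Python. The users dictionary is a hypothetical in-memory stand-in for a database table, and register and login are names invented for the example; SHA-1 and the salt-in-front layout simply mirror the examples above.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for a users table: user name -> (salt, hash).
users = {}


def salted_sha1(salt, password):
    """Put the salt in front of the password and hash the combined string."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def register(name, password):
    """Create a fresh salt and store it together with the salted hash."""
    salt = secrets.token_hex(20)
    users[name] = (salt, salted_sha1(salt, password))


def login(name, attempt):
    """Re-hash the attempt with the stored salt and compare the results."""
    if name not in users:
        return False
    salt, stored_hash = users[name]
    return hmac.compare_digest(stored_hash, salted_sha1(salt, attempt))


register("alice", "Passw0rd")
print(login("alice", "Passw0rd"))   # True
print(login("alice", "passw0rd"))   # False
print(login("bob", "Passw0rd"))     # False, no such user
```

The plain text password is never stored; only the salt and the salted hash are kept.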

Conclusion

I hope password hashing and storage have been explained simply enough for everyone to understand, and I hope that after reading this post, you reevaluate your passwords. One other thing to keep in mind: hashing algorithms can handle any kind of string input and do not require limitations. Websites that impose arbitrary restrictions on your passwords should set off red flags, because it can be an indicator that they’re storing the passwords in plain text. Think twice about the passwords you use for these websites and services.

Posted in Security

The Price of Security

July 15, 2012

There have been quite a few data breaches in the last month or so. LinkedIn, eHarmony, and Last.fm were all breached in early June and had user credentials leaked. Just a few days ago Yahoo!, Billabong, Formspring, Phandroid, and Nvidia were all breached as well, with user information including passwords leaked. These breaches vary in severity; some of these sites were storing their passwords in plain text (why are companies still doing this? Haven’t we learned from the Sony debacle?) and others had hashed them. Storing passwords in plain text is the biggest (and simplest to avoid) security mistake a company can make. Hashing the passwords is definitely better, but hashes are not infallible, as there are sites out there with massive hash dictionaries.

I don’t have much else to say about these breaches except that if you have an account with one of the above websites, I suggest you change your password immediately, as well as any other account where you may be using the same user name/email address and password combination (you’re not doing that, though, right?).

I’m no expert when it comes to security, but I like to think about it a lot, especially when I read about these high-profile breaches. I also like to think I’ve learned enough over my years of programming that I could make a pretty secure web application. It may not be impenetrable, but it would be enough to repel opportunists. And that’s really what it’s about, isn’t it? Preventing “crimes of opportunity”?

I liken web and computer security in general to the security of a home, and what the price of it is. The price of security is convenience. Imagine your home with little to no security: you leave your doors unlocked or open, or heck, you may not even have a door! Anyone could come and go as they please. It’s very insecure but also very convenient. You never have to worry about forgetting your keys, or losing them, or locking yourself out. It’s a dream! Ever come home and your arms are full of groceries and you’re struggling to get your keys and open your front door? No need for that, just walk in!

Of course, no one lives like that, so we put some security on our homes: we have doors with locks, we have garage doors with keypads, we have locked windows. Some homes even have security alarms or security monitoring. Others may even own a dog or two (though I hope they have the dog for companionship rather than for security).

Pretty standard security. But houses get broken into all the time. It’s a very common occurrence. Imagine upping your home’s security. Imagine every room in your home had a locked door. An intruder could break in, but he would be very limited in what he could do. Your house is very secure. On the flip side, though, it’s incredibly inconvenient. Every time you want to go to another room you have to bring a key. What if you lose a key? Uh oh.

So instead we choose to have a certain level of security because too much of it is too inconvenient. The same applies to web applications. It applies to both the user and the developer. A huge amount of forethought has to go into designing, testing, debugging, and fixing a secure system (as well as maintaining, and hopefully documenting). Users have to be able to use the system easily enough so they don’t become frustrated and leave. Imagine having to enter your user name and password every time you wanted to access a page of a site, or having different passwords for different areas. It would be secure because an attacker would need to know the specific password to the specific area he wants to access, or try to get them all. But it would be a huge pain to actually use.

In the end we have to decide what is an acceptable amount of inconvenience. I’m not sure there’s a definite answer to that, but we should strive to find the middle ground between too much security and too much convenience.

Posted in Security