If you’ve ever had to listen to a security briefing for a website and its login system, you’ll know that one of the most important things you can do, besides hashing your passwords the correct way, is to salt them. The fun thing about salts is that they don’t have to be cryptographically secure in origin to provide an extra layer of security (as something is better than nothing). But what differences come up when we take that extra step to make them secure, and what steps are appropriate to take?
The Avalanche
One thing all secure hashes have is something called ‘The Avalanche Effect’: if you change just one character in your password, the resulting hash will look completely different. For example:
    $password = 'abcde';
    echo sha1($password);
    //Presents: 03de6c570bfe24bfc328ccd7ca46b76eadaf4334

    $password = 'abcdf';
    echo sha1($password);
    //Presents: 9693da0e085af20ef1f982b017fc6ec2419848e5
The two look nothing alike to the human eye, but here’s the problem: even with the strength attributed to sha1, the hash is completely predictable. The same password always produces the same hash, so anyone who has already hashed ‘abcde’ once will recognise it at a glance. This is where salting comes in. A salt is simply a known value that is stuck (somewhere) onto a private value so that the resulting hash differs from the hash of the private value on its own. This matters most when users have the same password. For that reason, we also keep salts unique between users, so that if two of them do share a password, a hacker who has all the hashes and salts won’t be able to tell just by comparing the hashes. So take, for instance, this:
    $password = 'aabcde';
    echo sha1($password);
    //Presents: 404940891010bbba961496918826d91fc2e2f5ac

    $password = 'babcdf';
    echo sha1($password);
    //Presents: 09515cccb9e59871f1ea1a2a34920367a23b4a72
Again, to the human eye they look completely different, but the biggest prize from doing such a thing is that a precomputed rainbow table is no longer much use to the hacker: he has to attack each salted hash on its own. So the question is, how do we make him spend as much time on that as possible?
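To make the point about shared passwords concrete, here is a minimal sketch. The helper name `salted_sha1` and the one-character salts are made up for illustration, and sha1 is used only because the examples above use it.

```php
<?php
// Illustrative helper (not from the original post): prepend a per-user salt
// to the password and hash the result with SHA-1.
function salted_sha1(string $salt, string $password): string
{
    return sha1($salt . $password);
}

$password = 'abcde';

// Without a salt, two users with the same password get the same hash.
$alice_plain = sha1($password);
$bob_plain   = sha1($password);
var_dump($alice_plain === $bob_plain); // bool(true)

// With a different (hypothetical) salt per user, the hashes no longer match.
$alice_salted = salted_sha1('a', $password);
$bob_salted   = salted_sha1('b', $password);
var_dump($alice_salted === $bob_salted); // bool(false)
```

Real salts would be much longer than one character; the short ones here just mirror the example above.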
Tales from the Crypt
PHP has a wonderful function called “crypt”. If your PHP install is set up correctly, you’ll be able to use a very popular algorithm called ‘Blowfish’ (more precisely bcrypt, the password-hashing scheme built on the Blowfish cipher) through this function. Blowfish is currently a preferred password-hashing algorithm because it is deliberately slow and has a strong design behind it, amongst other things. This is a key ingredient when hashing passwords: the slower the algorithm, generally, the more expensive each guess becomes for an attacker. (Please note that slowness alone isn’t enough: something that takes 10 seconds to turn “abc” into “abd” or “efg” into “efh” is slow, but still a weak algorithm.)
The curiosity about Blowfish is that its salt is limited to 22 characters. So how do we make the most of those 22 characters? Warning: there is maths ahead.
Math Ahoy!
When we calculate the potential strength of a resulting hash, we start by looking at all the possible outputs it can produce. For instance: sha1 provides a result that’s 40 characters long, each character being 0-9 or a-f (16 possible values per character), so we calculate 16**40 = ~1.5e48 potential combinations. That’s a 15 followed by 47 zeroes. That’s no small number to laugh at. But since we’re limited to 22 characters in length, we have 16**22 = ~3.1e26. Significantly smaller, but still far from tiny.
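The arithmetic above is easy to check in PHP itself. This is only a sketch: the helper name `combinations` is made up, and the results are floats, so they are approximations rather than exact integers.

```php
<?php
// Number of distinct strings of $length characters drawn from an alphabet
// of $alphabet_size symbols. The result is a float, so it is approximate.
function combinations(int $alphabet_size, int $length): float
{
    return (float) pow($alphabet_size, $length);
}

// Full hexadecimal sha1 output: 40 characters, 16 values each (about 1.5e48).
printf("16**40 = %.1e\n", combinations(16, 40));

// Only the first 22 hexadecimal characters (about 3.1e26).
printf("16**22 = %.1e\n", combinations(16, 22));
```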
So how do we make the sha1 return better? Well, one idea is to base64 encode the raw (binary) sha1 output instead of using the hexadecimal string. Long story short, that shrinks the 40 character output to 28 characters, because each base64 character can take one of 64 values instead of just 16. The encoding doesn’t create any new combinations (the 28 characters still describe the same ~1.5e48 possible hashes), but it packs more of the hash into every character. Small fact: Blowfish doesn’t enjoy two of the characters base64 can output (the ‘+’ and ‘=’ signs), so those need swapping for characters it does accept, which still leaves us with 64 values per character. Keeping the first 22 characters (due to the salt length limitation) gives 64**22 = ~5.4e39. That’s larger than just cutting the hexadecimal sha1 output to 22 characters by a fair set of magnitudes (13, if my algebra isn’t failing me).
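As a rough illustration of the two encodings, the sketch below hashes the same input both ways. It assumes the base64 step is applied to the raw 20-byte digest (the second argument to `sha1()` set to `true`), since encoding the hexadecimal string would give a longer result.

```php
<?php
// Hex form: 40 characters, 4 bits of the hash per character.
$hex = sha1('abcde');

// Base64 form of the raw 20-byte digest: 28 characters (the last is '=' padding),
// 6 bits of the hash per character.
$b64 = base64_encode(sha1('abcde', true));

echo strlen($hex), "\n"; // 40
echo strlen($b64), "\n"; // 28

// Cutting each form to 22 characters keeps very different amounts of the hash.
$hex22 = substr($hex, 0, 22); // 22 * 4 = 88 bits
$b6422 = substr($b64, 0, 22); // 22 * 6 = 132 bits
```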
So What’s Next?
The biggest problem I have sitting here is that the security of the sha1 algorithm has been drawn into question. Practical ways of finding different inputs that produce the same output weaken our eventual goal of ~5.4e39, as they effectively reduce this number. That is a big part of why the sha2 family of algorithms is recommended instead: SHA256, SHA384, and SHA512. So, to make it even more secure, we can use one of them to ensure that the 22 characters we gain are more unique and we’re closer to achieving that ultimate ~5.4e39 combinations. There are other algorithms we can use to get good outputs to convert as well, such as the whirlpool algorithm (my favourite) and others, but for the contents of this talk, they’re effectively equivalent.
The math and process are the same as above, but now we’re using a bigger data pool (a larger output), which ensures a smaller occurrence of collisions within the first 22 characters than just using base64 encoded sha1. The only other step up we can take is randomly generating those 22 characters ourselves. Under normal circumstances I would say ‘no’ to rolling our own, and indeed even sending ‘mt_rand()’ through SHA512 has been deemed ineffective, because hashing a predictable number doesn’t make it any less predictable. So we are forced to look to other sources. Thankfully, PHP has a way for us.
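Swapping in a sha2 hash might look like the sketch below. The helper name `sha512_base64_prefix` and the sample input are assumptions for illustration; `hash()` with the `'sha512'` algorithm and the raw-output flag is standard PHP.

```php
<?php
// Illustrative helper (the name is an assumption): base64 encode the raw
// SHA-512 digest of some input and keep only the first $length characters.
function sha512_base64_prefix(string $input, int $length = 22): string
{
    $raw = hash('sha512', $input, true); // 64 raw bytes
    return substr(base64_encode($raw), 0, $length);
}

// The input here is a placeholder; any per-user value could stand in for it.
echo sha512_base64_prefix('some per-user input'), "\n";
```

The result can still contain ‘+’, so it would need the same character swap before Blowfish accepts it as a salt.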
Tales from the… Mcrypt?
PHP has a function for this that works on any OS, in builds of PHP that ship the mcrypt extension. It’s called ‘mcrypt_create_iv’. An IV is an initialization vector, and is used by a lot of secure two-way algorithms. We can use that as the source of our salt. By saying that we want a 16 byte long IV (which translates to 24 characters base64 encoded), we get a known-secure source for our salt, ensuring the uniqueness that we’re striving for!
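One caveat for newer installs: the mcrypt extension was deprecated in PHP 7.1 and removed from core in PHP 7.2, so ‘mcrypt_create_iv’ may not be available. The sketch below swaps in `random_bytes()` (built in since PHP 7.0) as the random source. That substitution, and the helper name `random_salt_base64`, are not part of the original approach.

```php
<?php
// Assumption: PHP 7.0 or later, where random_bytes() is available.
// random_bytes() stands in for mcrypt_create_iv() here.
function random_salt_base64(): string
{
    $raw = random_bytes(16);    // 16 bytes = 128 bits from the OS CSPRNG
    return base64_encode($raw); // 24 characters, the last two being "=="
}

$salt = random_salt_base64();
echo strlen($salt), "\n"; // 24
```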
“But wait!” you say, “Why can’t we just throw THAT value into SHA512 and then encode the result?” There’s nothing stopping you from doing that, outside of a lack of need. One thing that a lot of people get confused about when we’re talking about a ‘secure salt’ is that the salt itself only has to be unique (or at least, ‘unique enough’ according to a computer). Beyond that, who knows the salt doesn’t really matter, as long as the correct algorithm is used to hash your passwords. For instance, the Blowfish algorithm actually returns the salt given to it as part of the resulting hash, for easy verification when logging in. You store the entire thing in the database. It’s the public half of the hash, so the security of the actual password lies in the password itself before it goes through the hash, and in the hashing algorithm itself, right where it should be.
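Here is a small sketch of that round trip. The fixed salt, the sample passwords, the low cost of 04, and the helper name `check_password` are all assumptions chosen to keep the example short and reproducible.

```php
<?php
// To check a login attempt, pass the stored hash back in as the salt argument
// and compare the result with the stored hash.
function check_password(string $attempt, string $stored): bool
{
    return hash_equals($stored, crypt($attempt, $stored));
}

// A fixed 22-character salt, used only so the example is reproducible.
// Real salts should be generated randomly for every user.
$salt = 'abcdefghijklmnopqrstuu';

// Cost 04 keeps the example fast; a real system would use a higher cost.
$stored = crypt('correct horse', '$2y$04$' . $salt);

echo strlen($stored), "\n";        // 60
echo substr($stored, 0, 7), "\n";  // $2y$04$
// The salt travels inside the stored hash, right after the prefix.
echo substr($stored, 7, 22), "\n";

var_dump(check_password('correct horse', $stored)); // bool(true)
var_dump(check_password('wrong guess', $stored));   // bool(false)
```

Because the prefix, cost and salt are all read back out of the stored string, nothing besides that one column is needed to verify a login.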
The Code!
And the moment you’ve all been waiting for. It’s a surprisingly small amount of code this time around, but it is extremely vital to keeping your users’ passwords safe. So until next time, enjoy!
    function salt(){
        $replace = array(
            "+" => ".",
            "=" => "/"
        );
        $salt = mcrypt_create_iv(16, MCRYPT_DEV_URANDOM); //Get 16 bytes of random data from the OS
        $salt = base64_encode($salt); //Base64 encode it
        $salt = substr($salt, 0, 22); //Take it down to 22 characters
        $salt = strtr($salt, $replace); //Replace any naughty characters with safe ones
        return $salt;
    }

    //Named hash_password() because PHP already has a built-in hash() function
    function hash_password($password){
        return crypt($password, '$2y$12$'.salt()); //Hash everything
    }
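For comparison, on PHP 5.5 and later the built-in `password_hash()` and `password_verify()` functions wrap the same Blowfish scheme and generate the salt internally. The sketch below is an alternative to the hand-rolled version above rather than a description of it, and the sample passwords are placeholders.

```php
<?php
// Assumption: PHP 5.5 or later. password_hash() picks a random salt itself,
// so there is no salt function to write.
$stored = password_hash('hunter2', PASSWORD_BCRYPT, ['cost' => 12]);

echo substr($stored, 0, 7), "\n"; // $2y$12$

// password_verify() reads the cost and salt back out of the stored hash.
var_dump(password_verify('hunter2', $stored)); // bool(true)
var_dump(password_verify('hunter3', $stored)); // bool(false)
```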