Using HMACs instead of Plain Hashes for Security

We all know that passwords should be stored as cryptographic hashes (SHA-256, for example) these days. The Internet has been around too long for anyone not to know this. In addition, engineers who write communications code will often employ “message authentication codes” (“MACs”). These are hashes calculated over a shared secret plus the data that is passed back and forth with the remote system, and they are used to verify that the data hasn’t been modified in transit (defending against “man in the middle” attacks).

However, it is much less widely known that hashes, used the way they often are, are vulnerable to attack. For example (in Python):

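from hashlib import sha256

# The "salt" here is really a shared secret that the attacker never sees.
salt = b"3qk4yfbgql"
message = b"some data"

# Vulnerable: the secret is simply prepended and the result hashed once.
salted_data = salt + message
digest_ = sha256(salted_data).hexdigest()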

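This is no good. Hash functions such as MD5, SHA-1, and SHA-256 are what are referred to as “iterative hash functions”. In other words, they calculate the hash by iterating, one chunk at a time, through the cleartext, and the digest they output is simply the internal state left over after the last chunk. They suffer from what Bruce Schneier refers to as the “length extension bug”: it’s trivial for a third party who has seen a message and its MAC to append data to the message, even though it’s salted with a value they don’t know, and calculate another, completely valid MAC. If you move the salt to the end of the message, there are different problems: any collision an attacker can find in the underlying hash function carries over to the MAC.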
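The iterative behavior is easy to see with hashlib itself. The sketch below is only an illustration, not a working attack: it continues hashing from a copy of the hash object, whereas a real attacker has only the published digest, rebuilds the internal state from it, and has to include the padding that the original hash appended. The appended value in extra is made up for the example.

```python
from hashlib import sha256

salt = b"3qk4yfbgql"
message = b"some data"
extra = b"&admin=true"  # hypothetical data an attacker wants to append

# Hash the salted message, then keep going from the saved internal state.
state = sha256(salt + message)
continued = state.copy()
continued.update(extra)
extended_digest = continued.hexdigest()

# Hashing everything in one go gives the same answer, so the state after
# salt + message is all that is needed to extend the hash; the salt itself
# is never needed again.
direct_digest = sha256(salt + message + extra).hexdigest()
same = extended_digest == direct_digest
```

Because a plain digest exposes that internal state, knowing the digest is nearly as good as holding the hash object above.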

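Enter the HMAC (“keyed-hash message authentication code”). An HMAC tool will take the salt (which plays the role of the secret key), the data, and a hash function, and do roughly this: hash(salt + hash(salt + message)). Strictly speaking, the key is first padded to the hash’s block size and XORed with a different constant for the inner and the outer hash, but the nesting is the important part: an attacker can still extend the inner hash, but cannot compute the outer one without the key.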

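import hmac
from hashlib import sha256

# As before, the "salt" is the secret key shared by both parties.
salt = b"3qk4yfbgql"
message = b"some data"

# hmac.new(key, msg, digestmod) handles the key padding and nested hashing.
hmac_ = hmac.new(salt, message, sha256)
digest_ = hmac_.hexdigest()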
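To make the nested construction concrete, here is a minimal sketch of HMAC-SHA256 written by hand and checked against the standard library. The function name hmac_sha256 is made up for this illustration; the 64-byte block size and the constants 0x36 and 0x5C come from the HMAC definition (RFC 2104). It is meant for reading, not for production use, where the hmac module is the right tool.

```python
import hmac
from hashlib import sha256


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Hand-rolled HMAC-SHA256, for illustration only."""
    block_size = 64  # SHA-256 processes input in 64-byte blocks

    # Keys longer than one block are hashed first; all keys are then
    # zero-padded to exactly one block.
    if len(key) > block_size:
        key = sha256(key).digest()
    key = key.ljust(block_size, b"\x00")

    # Derive two different keys by XORing with the inner and outer pads.
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)

    # Inner hash over key + message, then outer hash over key + inner digest.
    inner = sha256(inner_key + message).digest()
    return sha256(outer_key + inner).digest()


salt = b"3qk4yfbgql"
message = b"some data"

by_hand = hmac_sha256(salt, message)
from_library = hmac.new(salt, message, sha256).digest()

# Constant-time comparison, the same call a receiver would use to check a MAC.
matches = hmac.compare_digest(by_hand, from_library)
```

When checking a MAC received from the other side, compare it with hmac.compare_digest rather than ==, since that function is designed not to leak, through timing, how much of the value matched.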

Problem solved (for now).