Security Tip: Do You Really Need a Hash for That?

[Tip#65] Before you reach for a hashing function, stop and think about what you're hashing and why you're hashing it...

I frequently come across code that looks like this in my audits:

$model = Model::create(...);
$model->hash = md5($model->id);
$model->save();

return $model->hash;

A new model is created, and then a new hash is generated, based on some aspect of the model data. This is often done from the incremental ID itself, but it’s also common to use user-submitted data like names, filenames, or email addresses, or even just a timestamp. The hash is almost always generated using md5() or sha1(), and sometimes a form of “randomness” is added, such as time() or rand(). The hash is then returned and given to the user in some way - usually via a filename or URL key.

It’s easy to see the problem the developer is trying to solve: How do I generate an unguessable unique key for this model?

I can see the appeal of using md5() or sha1() here. They are both incredibly fast and produce a short output which looks unique and appears unguessable, so it’s easy to reach for those functions in this situation, especially if you’re not familiar with newer hashing algorithms, such as SHA256 or xxHash. However, if you’re passing the hash back to the user, this approach is horribly insecure. In fact, even using a secure hashing algorithm like SHA256 on its own in this scenario should be avoided. If your inputs are predictable (in this case, an incremental ID), an attacker can simply hash every likely input and compare the results, so the output can be easily brute-forced.
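To make the brute-force risk concrete, here is a minimal sketch in plain PHP. The helper name recoverIdFromMd5() and the ID 4521 are made up for illustration, and it assumes the attacker has already guessed that tokens are md5() of an incremental ID:

```php
<?php

// Hypothetical helper: recover an incremental ID from its md5() "token"
// by hashing candidate IDs until one matches.
function recoverIdFromMd5(string $token, int $maxId = 1000000): ?int
{
    for ($id = 1; $id <= $maxId; $id++) {
        if (hash_equals($token, md5((string) $id))) {
            return $id;
        }
    }

    return null; // no ID in the searched range produced this token
}

// A token the attacker saw in a URL (4521 is an arbitrary example ID).
$leaked = md5('4521');

$recovered = recoverIdFromMd5($leaked);

// Once the scheme is known, tokens for neighbouring records can be computed directly.
$nextToken = md5((string) ($recovered + 1));
```

The loop only has to cover the range of plausible IDs, which is why a fast hash over a predictable input offers so little protection. Swapping md5() for a stronger algorithm would not change the outcome here.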

So what’s the solution? It depends…

Generating an unguessable unique token?

If you just need an unguessable unique token which is generated once and reused, without needing regeneration or verification to prevent changes, then there is no reason to hash anything. Instead, there are a few methods you can use to generate a secure, unguessable, unique random string:

  1. Str::random(32) → My personal favourite. It’s completely unguessable, and unique enough for virtually all situations.
  2. Str::uuid() → A popular choice. Generates a UUID (v4) that is unguessable and unique.
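For readers outside Laravel, the same idea can be sketched in plain PHP. This is only an illustration of the approach, not how Str::random() is implemented, and the generateToken() helper name is hypothetical:

```php
<?php

// Hypothetical helper (not part of Laravel): build a token from
// cryptographically secure random bytes instead of hashing model data.
function generateToken(int $bytes = 16): string
{
    // random_bytes() draws from the operating system's CSPRNG.
    // bin2hex() doubles the length, so 16 bytes become 32 hex characters.
    return bin2hex(random_bytes($bytes));
}

$token = generateToken();
```

Because the token is not derived from the ID, a timestamp, or any user input, there is nothing for an attacker to enumerate. In a Laravel app you would typically generate the token once, store it in its own column on the model, and look records up by that column.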

For special cases, there are some potentially less-secure options too:

  1. Str::ulid() → A short time-orderable unique token, which combines a millisecond timestamp with secure randomness to produce tokens that can be sorted without revealing sensitive information like an incremental ID.
  2. HashIDs → Masks incremental IDs behind a short hash, which can be decoded to extract the original ID. These are useful if you can’t store a token alongside the model, but need to mask an ID.
💡
I’ve called these “potentially less-secure” because they include significantly less randomness (a ULID carries 80 random bits, compared with 122 in a v4 UUID) and/or are much shorter (HashIDs), which can make them easier to predict. This would still require significant effort to exploit though, so they would be fine in many cases where you just need a token that doesn’t reveal the underlying ID. But they shouldn’t be relied upon to protect routes, etc.

Generating verifiable signatures or repeatable hashes?

This tip is already far too long for a weekly tip, so head over to “Use HMAC Hashes To Verify Data” for the next bit.