Security Tip: Don't Log Sensitive Data!

[Tip#61] While it's tempting to throw everything into logs, keep in mind where your logs end up: plain text files, third-party collectors, passed around the development team, etc.

Last week we talked about hiding sensitive parameters from stack traces, to prevent sensitive data from ending up in logs or being displayed in error/debug messages. Following on from that, the next logical step is to look at other ways sensitive data can end up in insecure locations, specifically in application logs.

I’m a big fan of verbose logging. I always turn on access logs in Forge, and my code is often full of log messages. It comes from when I used to clean infected WordPress sites: having verbose logs makes tracking malicious events and finding vulnerabilities so much easier! It’s also really helpful for debugging weird issues, and generally monitoring user behaviour. So my recommendation is to verbosely log events within your apps.

However, if your logging is too verbose, it opens the door to sensitive data finding its way into your logs. While it’s easy to dismiss the impact of logging sensitive data, you need to consider what happens with your logs:

  • Do you have a third-party log collector?
  • Do you download logs to dev environments for analysis?
  • Does your server management tool let you view logs?
  • How long are your logs kept around for? What’s the backup procedure?
  • Who on your team can view logs?

The “sensitive data” I’m talking about here can be anything: Personally Identifiable Information (PII), Protected Health Information (PHI), credit card details, API keys, passwords, etc. Basically, any piece of data that would have some impact if it were obtained by someone who shouldn’t have access to it.

Normally we store this sensitive data within our database or configuration files, but by logging it, we’re pushing that into plain text files, and sending it to third party systems. It’s no longer protected and much easier for someone to obtain! Third parties can be breached, developer machines can be compromised, those who shouldn’t have access to sensitive data can obtain log files “because they aren’t sensitive”, log entries can be shared for debugging… there are many possibilities.

Consider the following code:

Log::info(
    'Showing the user profile for user: {id}', 
    ['user' => $user]
);

It looks like a harmless log call, something many of us would have written, but take a look at the output:

[2023-11-03 21:02:18] local.INFO: Showing the user profile for user: {id} {"user":{"App\\Models\\Challenge\\User":{"id":1,"name":"Frodo Baggins","race":"hobbit","email":"frodo@example.com","email_verified_at":null,"password":"83850fc513742a3eb2e66433cf4123c2","code":"ring","remember_token":null,"created_at":"2023-06-03T12:12:54.000000Z","updated_at":"2023-06-03T12:12:54.000000Z","date_of_birth":"2968-09-22"}}}

These two immediately jump out at me:

"password":"83850fc513742a3eb2e66433cf4123c2"
"date_of_birth":"2968-09-22"

There is also the code field, which may be sensitive, and email, name, and race are PII too!

If someone gets into this log file, they don’t need to bother trying to get to your database - the log already contains the password hashes they need to start cracking their way into your app!

Instead, I would do this:

Log::info(
    'Showing the user profile for user: {id}', 
    ['id' => $user->id]
);

It still provides the same information, without the sensitive data. You can easily look up the user by ID within your admin tools, while anyone outside your team who gains access to the logs won’t be able to obtain this information.
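If you often need more than the ID, one option is an allow-list: name the fields that are safe to log and drop everything else. The sketch below is plain PHP so it runs on its own. The safeLogContext() helper and the sample array are hypothetical, not part of Laravel.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: keep only explicitly allowed keys from a record
 * before handing it to the logger as context.
 *
 * @param array<string, mixed> $attributes the record's attributes
 * @param list<string>         $allowed    keys that are safe to log
 * @return array<string, mixed>
 */
function safeLogContext(array $attributes, array $allowed = ['id']): array
{
    // array_flip turns ['id'] into ['id' => 0] so the keys can be intersected.
    return array_intersect_key($attributes, array_flip($allowed));
}

// Sample data mirroring some of the fields in the log output above.
$user = [
    'id' => 1,
    'name' => 'Frodo Baggins',
    'email' => 'frodo@example.com',
    'password' => '83850fc513742a3eb2e66433cf4123c2',
    'date_of_birth' => '2968-09-22',
];

$context = safeLogContext($user);

echo json_encode($context), PHP_EOL; // {"id":1}
```

In a Laravel app you could pass something like safeLogContext($user->toArray()) as the context argument to Log::info(). An allow-list fails safe: a new sensitive column added to the table later stays out of the logs unless someone deliberately adds it to the list.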

My Recommendations

  1. Don’t send entire models into the logger as context.
  2. Reference records with IDs rather than names/emails/sensitive data.
  3. If you need to log sensitive data, anonymise it so unique values can be identified without the original values being stored in logs.
  4. Encrypt values that must be logged, so they can be decrypted by the system on-demand.
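
For the third recommendation, a keyed hash is one way to do it. The same input always produces the same token, so you can still correlate log entries, but the original value never reaches the log. This is a minimal sketch in plain PHP; the pseudonymise() helper, the key, and the 16-character token length are all assumptions made for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: replace a sensitive value with a keyed hash so the
 * same input always yields the same token, without storing the original.
 */
function pseudonymise(string $value, string $secretKey): string
{
    // Normalise so " Frodo@Example.com" and "frodo@example.com" match.
    $normalised = strtolower(trim($value));

    // HMAC-SHA256 rather than a bare hash: without the key, an attacker
    // cannot confirm a guessed email by hashing it themselves.
    $digest = hash_hmac('sha256', $normalised, $secretKey);

    // A shortened token is easier to read in logs; 16 hex characters is an
    // arbitrary choice for this sketch.
    return substr($digest, 0, 16);
}

// Assumption: in a real app the key comes from configuration, never from source code.
$key = 'example-key-for-illustration-only';

$a = pseudonymise('frodo@example.com', $key);
$b = pseudonymise(' Frodo@Example.com', $key);
$c = pseudonymise('sam@example.com', $key);

var_dump($a === $b); // bool(true): same email, same token
var_dump($a === $c); // bool(false): different email, different token
```

Keep the key out of the logs and away from anyone who only has log access, otherwise guessed values can be confirmed. For the fourth recommendation, Laravel’s Crypt::encryptString() and Crypt::decryptString() are a reasonable starting point.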