Security Tip: Be Careful Of Transliteration

[Tip#15] Since we don't have enough weird edge cases to worry about in security, here's one more: Transliteration allows you to bypass security checks when services like MySQL do magical translation without telling you! 😱

Transliteration is the process of converting text from one set of characters to another in a predictable way, usually with the goal of keeping the resulting string readable while adding decoration, or of tricking readers into thinking the resulting string is the original (as in phishing links).

For example:

One Ring to rule them all

Can be represented as:

Ⓞⓝⓔ Ⓡⓘⓝⓖ ⓣⓞ ⓡⓤⓛⓔ ⓣⓗⓔⓜ ⓐⓛⓛ
💡 It's worth pointing out that while transliterated characters like these can look very cool and are fun to use on social media, they are often not accessible to blind and vision-impaired people. They may be difficult or impossible to read, and screen reader software may be unable to understand them or translate them back into the original characters.

While they can look cool, they do introduce a fun security problem…

It turns out that MySQL automatically translates transliteration characters back into their originals when it performs a query. So if you performed a search for "Ⓕⓡⓞⓓⓞ", it would actually search for "Frodo".

I honestly had no idea this was a thing, so I tested this on my SQLi demo and it works!

Searching for "Ⓕⓡⓞⓓⓞ" matches "Frodo".
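To see why this matters on the PHP side, here is a minimal sketch (not taken from the demo above). The $blocklist array and isBlocked() helper are hypothetical names used purely for illustration, and the circled letters are written as \u{...} escapes so the snippet survives copy and paste. PHP compares strings byte for byte, so the circled spelling passes an exact-match check, even though the search described above treats it as "Frodo".

```php
<?php
// Hypothetical blocklist check, for illustration only.
$blocklist = ['Frodo'];

function isBlocked(string $input, array $blocklist): bool
{
    // Strict, byte-for-byte comparison: no transliteration happens here.
    return in_array($input, $blocklist, true);
}

$plain   = 'Frodo';
$circled = "\u{24BB}\u{24E1}\u{24DE}\u{24D3}\u{24DE}"; // "Ⓕⓡⓞⓓⓞ"

var_dump(isBlocked($plain, $blocklist));    // bool(true)
var_dump(isBlocked($circled, $blocklist));  // bool(false): the check is bypassed
var_dump(strlen($plain), strlen($circled)); // int(5) then int(15): different bytes
```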

The fix for this is fairly trivial: transliterate the string back to a basic character set before you check it. PHP will then see the same thing as MySQL, allowing checks such as a rate limiter to properly catch it.

A PR merged into Laravel 8 added the Str::transliterate() helper, which you can use wherever this is a problem to translate the characters back to their originals:

>>> Str::transliterate("Ⓕⓡⓞⓓⓞ");
=> "Frodo"

(Internally it uses the Portable ASCII package to perform the translations.)
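If you want to see the idea in isolation, outside of Laravel, the sketch below normalises input before the check. toAscii() is a hypothetical and deliberately tiny stand-in for Str::transliterate(): it only maps the circled Latin letters (U+24B6 to U+24E9) back to A-Z and a-z, whereas the real helper covers far more characters. isBlocked() is likewise a made-up name.

```php
<?php
// Hypothetical stand-in for Str::transliterate(); handles circled Latin letters only.
function toAscii(string $input): string
{
    $map = [];
    for ($i = 0; $i < 26; $i++) {
        // U+24B6 starts the circled capitals, U+24D0 the circled lowercase letters.
        foreach ([0x24B6 => ord('A'), 0x24D0 => ord('a')] as $base => $ascii) {
            $cp = $base + $i;
            // Encode the code point as 3-byte UTF-8 by hand (no extensions needed).
            $utf8 = chr(0xE0 | ($cp >> 12))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
            $map[$utf8] = chr($ascii + $i);
        }
    }

    return strtr($input, $map);
}

function isBlocked(string $input, array $blocklist): bool
{
    // Normalise first, so the PHP check sees the plain spelling.
    return in_array(toAscii($input), $blocklist, true);
}

$circled = "\u{24BB}\u{24E1}\u{24DE}\u{24D3}\u{24DE}"; // "Ⓕⓡⓞⓓⓞ"

var_dump(toAscii($circled));              // string(5) "Frodo"
var_dump(isBlocked($circled, ['Frodo'])); // bool(true)
```

In a Laravel app you would call Str::transliterate() at the point where toAscii() is used here.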

So you’re probably thinking: “This is a cool exploit, but why do we need to know about it?”

This is a clever bypass method that allows you to evade things like rate limiting, blocklists, content restrictions, and existence checks, as well as trick victims into misreading a string by using similarly shaped characters.

Think about it: do you have anything like that in your apps, where PHP performs a check against a string and the value is then passed into a MySQL query? Or where user-inputted strings are displayed to other users?
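Rate limiting is a good example of this pattern. The sketch below is hypothetical and framework-free: AttemptCounter is an in-memory stand-in for a real rate limiter, and the $toAscii closure only knows the four circled letters needed for the demo (the assumption being that a real Laravel app would pass Str::transliterate() instead). It shows how un-normalised keys give each lookalike spelling its own counter.

```php
<?php
// Hypothetical in-memory attempt counter standing in for a real rate limiter.
final class AttemptCounter
{
    /** @var array<string, int> */
    private array $hits = [];

    /** @var callable(string): string */
    private $normalise;

    public function __construct(callable $normalise)
    {
        $this->normalise = $normalise;
    }

    // Records an attempt and returns the running total for that key.
    public function hit(string $key): int
    {
        $key = ($this->normalise)($key);
        $this->hits[$key] = ($this->hits[$key] ?? 0) + 1;

        return $this->hits[$key];
    }
}

$circled = "\u{24BB}\u{24E1}\u{24DE}\u{24D3}\u{24DE}"; // "Ⓕⓡⓞⓓⓞ"

// Without normalisation, the two spellings are counted separately.
$naive = new AttemptCounter(fn (string $s): string => $s);
$naive->hit('Frodo');
var_dump($naive->hit($circled)); // int(1)

// Demo-only mapping; assumption: a real app would use a full transliterator.
$toAscii = fn (string $s): string => strtr($s, [
    "\u{24BB}" => 'F', "\u{24E1}" => 'r', "\u{24DE}" => 'o', "\u{24D3}" => 'd',
]);
$safe = new AttemptCounter($toAscii);
$safe->hit('Frodo');
var_dump($safe->hit($circled)); // int(2)
```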

Even ignoring the MySQL automatic behaviour that makes the exploit work, there are many cases where you’d want to swap out special characters in your apps. Content moderation comes to mind immediately!

So keep Str::transliterate() in mind when you’re handling user input, and check things like rate limiting to ensure you’re not vulnerable.

