Security Tip: Be Careful Of Transliteration
[Tip#15] Because we don't already have enough to worry about, without also needing to factor in other characters and emoji too...
Greetings everyone, and a big welcome to all the new subscribers who are joining us after Laracon Online! Your support means so much. I started Laravel Security in Depth to try and do something a bit different, and it has exceeded my expectations and grown beyond what I’d originally planned. When I launched it was just emails, and now we have interactive demos, security discussions, and I’m aiming to have a full video demo coming in the next week in our In Depth, covering timing attacks!
If you haven’t subscribed to the paid version yet, now is your chance: I currently have a 25% off special offer running for Laracon Online! Upgrade now to lock in your discounted price forever.
This week we’re talking about Transliteration. If that means nothing to you, don’t worry, it meant nothing to me until a follower pointed me towards the Throttle/Rate Limit Bypass Exploit that was recently fixed in Laravel Fortify. This exploit caught my eye because it uses transliteration to create unique combinations of usernames to bypass the rate limit, making it easier to conduct brute-force attacks. Basically, it’s my kind of sneaky! 😎
Be Careful Of Transliteration
Transliteration is the method of converting text from one set of characters to another in a predictable way, usually with the goal of maintaining readability in the resulting string while adding decorations, or tricking readers into thinking the resulting string is the original string. For example:
One Ring to rule them all
Can be represented as:
Ⓞⓝⓔ Ⓡⓘⓝⓖ ⓣⓞ ⓡⓤⓛⓔ ⓣⓗⓔⓜ ⓐⓛⓛ
While this can look cool, there is a problem with it…
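It turns out that MySQL effectively translates transliterated characters back into their originals when it compares strings in a query. (Strictly speaking, this comes from the collation in use, so it depends on how your database is configured.) So if you performed a search for "Ⓕⓡⓞⓓⓞ", it would actually search for "Frodo".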
I honestly had no idea this was a thing, so I tested this on our SQLi demo and it works!
The fix for this is fairly trivial: transliterate the string back to a basic character set first. PHP will then see the same thing as MySQL, allowing the rate limiter to properly catch it.
A PR was merged into Laravel 8 which added the Str::transliterate() helper. You can use it in situations where this is a problem to translate the characters back to their originals:
>>> Str::transliterate("Ⓕⓡⓞⓓⓞ");
=> "Frodo"
(Internally it uses the Portable ASCII package to perform the translations.)
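To make the bypass concrete, here's a rough sketch of a rate limit key built with and without folding the username first. This is not Fortify's actual code: the function names and the key format are made up for illustration, and it swaps in Unicode NFKC normalisation from PHP's intl extension instead of Str::transliterate() so that it runs without Laravel. NFKC handles decorated characters like the circled letters above, but it is a narrower technique and won't catch every look-alike character.

```php
<?php
// Hypothetical sketch, not Fortify's real implementation.
// Requires PHP's intl extension for the Normalizer class.

/** Naive throttle key: lowercases the raw username and appends the IP. */
function naiveThrottleKey(string $username, string $ip): string
{
    return mb_strtolower($username, 'UTF-8') . '|' . $ip;
}

/** Folded throttle key: maps decorated characters to base letters first. */
function foldedThrottleKey(string $username, string $ip): string
{
    // NFKC maps compatibility characters (e.g. circled letters) to their base letters.
    $folded = Normalizer::normalize($username, Normalizer::FORM_KC);

    // normalize() returns false on invalid input; fall back to the raw value.
    if ($folded === false) {
        $folded = $username;
    }

    return mb_strtolower($folded, 'UTF-8') . '|' . $ip;
}

$ip = '203.0.113.7'; // documentation-only example address

// Two different keys, so each variant gets a fresh set of attempts.
var_dump(naiveThrottleKey('Frodo', $ip) === naiveThrottleKey('Ⓕⓡⓞⓓⓞ', $ip));   // bool(false)

// One shared key, so both variants count against the same bucket.
var_dump(foldedThrottleKey('Frodo', $ip) === foldedThrottleKey('Ⓕⓡⓞⓓⓞ', $ip)); // bool(true)
```

In a Laravel app you'd most likely reach for Str::transliterate() at the point where the key is built, so the limiter and the database agree on which user is being targeted.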
So you’re probably thinking: “This is a cool exploit, but why do we need to know about it?”
This is a clever bypass method that allows you to evade things like rate limiting, blocklists, content restrictions, and existence checks, as well as trick victims into misreading a string by using similarly shaped characters.
Think about it: do you have anything like that in your apps, where PHP performs a check against a string and the value is then passed into a MySQL query? Or where user-inputted strings are displayed to other users?
Even ignoring the MySQL automatic behaviour that makes the exploit work, there are many cases where you’d want to swap out special characters in your apps. Content moderation comes to mind immediately!
So keep Str::transliterate() in mind when you’re handling user input, and check things like rate limiting to ensure you’re not vulnerable.
If you missed my Laracon Online talk, you can find it here: https://www.youtube.com/watch?v=0Rq-yHAwYjQ&t=34273s. In the talk I hack into the server behind the webapp and see what sort of fun I can get up to. To give you an idea, I left a number of people quite scared with my finale hack (or at least the idea of it, I didn’t actually do it, for reasons I explain in the talk).
Also, my talk from last year’s Laracon is here: https://www.youtube.com/watch?v=bFMBHRlEOjo. This is my original hacking talk, where I hack into a Laravel app live, showing various vulnerabilities.
It’s important to note that “readability” usually doesn’t extend to blind and vision-impaired people, in many cases because the characters are less clear or screen readers are unable to translate them.
Decorated text like this is very common on Twitter in display names.
It’s common for phishing links and other scams to use URLs that feature alternate characters that are easily confused.