Bitsquatting Is Alive and Well
Reading the recently released Cisco Annual Security Report, I paused on page 50, where the author discusses bitsquatting. The topic evidently continues to circulate in cybersecurity circles, and it recalls this paper from Black Hat in 2011. So we spent a few hours yesterday examining some registration patterns related to bitsquatting.
A bit, or binary digit, is the basic unit of data that computers store and communicate. Computers aren’t perfect, and bits can get corrupted within systems or in transit, which means they can get corrupted during DNS resolution. With a one-bit error in a DNS request, the response will come back for a different name. Every character in a domain name is stored as 8 bits, so it can be corrupted in up to 8 possible ways, one for each bit. Not all of the resulting bit patterns are alphanumeric characters, but some are. For example, the letter “t” can error to the number 4. And a “.” (dot) can error to the letter n.
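To make the character arithmetic concrete, here is a minimal Python sketch that flips each of the 8 bits of a character's ASCII code and keeps the results that could still appear in a hostname. The function names and the set of allowed characters (letters, digits, hyphen) are our own assumptions for illustration, not part of any particular tool.

```python
import string

# Assumption: letters, digits, and the hyphen are the characters usable in a
# hostname label. DNS names are case-insensitive, so we compare in lowercase.
VALID = set(string.ascii_lowercase + string.digits + "-")

def single_bit_flips(ch):
    """Return the 8 characters produced by flipping each bit of ch's code."""
    code = ord(ch)
    return [chr(code ^ (1 << bit)) for bit in range(8)]

def valid_flips(ch):
    """Keep only the flips that are still usable in a hostname label."""
    results = set()
    for flipped in single_bit_flips(ch):
        lowered = flipped.lower()
        # Skip case-only changes ("t" -> "T") and non-hostname characters.
        if lowered in VALID and lowered != ch.lower():
            results.add(lowered)
    return sorted(results)

print(valid_flips("t"))  # includes "4": 0x74 with one bit flipped is 0x34
print(valid_flips("."))  # includes "n": 0x2E with one bit flipped is 0x6E
```

Of the eight flips of “t”, three are discarded here: one is a case change, one is a pipe character, and one falls outside ASCII.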
What does this mean? It means that in very rare cases, someone who types in the domain name twitter.com will end up requesting 4witter.com due to bit corruption. And so on. And as that 2011 paper notes, by registering a lot of bitsquatted variants of some very popular website names, one can accumulate a material amount of error traffic.
In the screenshot above we ran the domain twitter.com against our bit-toggling tool, and indeed nearly all the alphanumeric variants are already registered, including 4witter.com. The same is true for most of the top Alexa sites.
Obviously, there is some overlap between what might be a bitsquat domain and what might be a fat-finger keyboard proximity typo. If you want a clear example of bitsquatting, look for any domain that starts with wwwn, because as I noted above the dot corrupts to the letter n. So a bitsquat of www.google.com is wwwngoogle.com. That domain is registered, but not by Google.
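For readers who want to experiment, here is a rough Python sketch of how one-bit variants of a whole hostname could be enumerated. It is not our bit-toggle tool; the function names are hypothetical, and it makes simplifying assumptions: it treats the name as plain ASCII text, accepts only letters, digits, hyphens, and dots, and does not check whether a corrupted TLD actually exists or whether a variant is registered.

```python
import string

# Assumption: only letters, digits, and hyphens are usable inside a label.
LABEL_CHARS = set(string.ascii_lowercase + string.digits + "-")

def is_plausible(name):
    """Require at least two non-empty labels with no leading/trailing hyphen."""
    labels = name.split(".")
    return len(labels) >= 2 and all(
        label and not label.startswith("-") and not label.endswith("-")
        for label in labels
    )

def bitsquat_variants(hostname):
    """Return hostnames that differ from `hostname` by exactly one flipped bit."""
    hostname = hostname.lower()
    variants = set()
    for i, ch in enumerate(hostname):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped == ch:
                continue  # a case-only change is the same name in DNS
            if flipped not in LABEL_CHARS and flipped != ".":
                continue  # result cannot appear in a hostname
            candidate = hostname[:i] + flipped + hostname[i + 1:]
            if is_plausible(candidate):
                variants.add(candidate)
    return variants

print("4witter.com" in bitsquat_variants("twitter.com"))
print("wwwngoogle.com" in bitsquat_variants("www.google.com"))
```

Because the dot is treated like any other character, the sketch produces the wwwn case as well as variants whose TLD has been corrupted; a real tool would likely filter the latter against a list of valid TLDs.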
At some point we’ll put our bit-toggle analysis tool up in our LABS area (reminder: Pro and Enterprise customers can request access to the research tools in LABS that are not publicly available). While bitsquatting is not nearly the problem that more overt types of DNS-based trademark abuse are, clients interested in protecting their brand online might want to keep this variant in mind. We’ve done some additional analyses of some very high-traffic hostnames, and the results are pretty interesting. More on that in another blog post.