May 6, 2021
What the Heck are CAPTCHAs and How Do They Work?
Posted by Rhiannon
Are you a robot? Some days, the internet sure seems to think so, and it acts accordingly by making you prove your humanity. We’ve all come across the seemingly silly tests asking us to select any picture with a stop sign in it, or having us check a box with an implied pinky promise that we’re not robots. These tests, as you may know, are called CAPTCHAs. The rather “captchy” acronym is short for “completely automated public Turing test to tell computers and humans apart.” As the name suggests, they’re used to differentiate between humans and bots online. But why are they used, how do they work, and what are their drawbacks?
- The 5 Ws of CAPTCHAs
- Types of tests
- CAPTCHA threats
The 5 Ws of CAPTCHAs
What are CAPTCHAs?
As mentioned above, CAPTCHAs are tests that check whether a user is a human or a bot.
Where Do You Find Them?
These tests can appear on any website, most often on pages that bots like to target, such as account sign-ups and checkouts.
Who Creates Them?
CAPTCHAs are, ironically enough, generated automatically by computer programs, which is to say by bots. These programs are run by CAPTCHA providers, such as Google.
When Are They Used?
Some websites use CAPTCHAs every time a specific event is triggered by a user. For example, an ecommerce website may use one of these tests every time someone proceeds to a checkout page. Other websites only show a CAPTCHA when they detect activity that seems suspicious (like clicking links too quickly). When such activity occurs, a CAPTCHA is generated.
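The “clicking links too quickly” trigger could be sketched as a sliding-window check. This is only an illustration: the `ClickMonitor` class, the limit of 5 clicks, and the 2-second window are assumptions made for the example, not how any particular website works.

```python
from collections import deque


class ClickMonitor:
    """Flags a visitor whose recent clicks arrive faster than a person plausibly could."""

    def __init__(self, max_clicks=5, window_seconds=2.0):
        # Both limits are assumed values; a real site would tune them.
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_click(self, timestamp):
        """Record a click time (in seconds); return True if a CAPTCHA should be shown."""
        self._timestamps.append(timestamp)
        # Forget clicks that have fallen out of the sliding window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_clicks
```

A visitor who clicks once every couple of seconds never fills the window, while a script firing several requests within a fraction of a second trips the check on the click that exceeds the limit.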
Why Do Websites Use Them?
Most people understand that they’re used to keep bots off a website, but are in the dark as to why that’s necessary. There are a few reasons.
For example, say you want to buy tickets to a concert, but find that it’s sold out. Then a scalper comes along and starts selling hundreds of tickets, but for triple the price. Did the scalper really sit there and buy batches of tickets by hand? Of course not. Instead, they used a fleet of bots to snatch up as many tickets as possible. By forcing every buyer to complete a CAPTCHA, websites can make this much harder to pull off.
In other cases, websites want to prevent malicious bots from visiting their pages for the purposes of data scraping, fake account creation, or similar activities.
How Do They Know Who’s Human and Who Isn’t?
In many cases, CAPTCHAs use context as a method for determining if a user is human or not. Because of the learned experiences we gather throughout the course of our lives, humans are great at intuitively interpreting media in context, something that bots struggle with.
For example, you can tell a bot that a stop sign is a red, octagonal object with the word “Stop” printed on it in white. Yet, if you show a bot a picture of a stop sign, it may not recognize it in the context of the photo. Things like lighting, angles, and other objects in the image can all confuse bots. Meanwhile, if you show the same photo to a human, they’ll be able to immediately recognize the stop sign because we intuitively understand context and can filter out extraneous data.
In other cases, CAPTCHAs may analyze user behaviour; in general, humans and bots act very differently, and those differences help tell us apart. We’ll explain this in more detail below.
Different Types of CAPTCHAs
Every CAPTCHA is a test designed to tell the difference between a human and a bot. But they don’t all work the same way. This variety helps different websites accomplish different things, and also reduces the odds that bots can be taught how to beat every type of CAPTCHA.
Recognition Tests
The earliest CAPTCHAs asked users to type out a specific word, which was often written in a difficult-to-read font. Because humans are used to reading different fonts and can process difficult images better than a robot can, it was a good start to bot security. However, bots have come a long way since then and many are now able to fool these types of CAPTCHAs.
Because of this, recognition tests have evolved. Instead of asking humans to recognize and type a garbled word, many websites ask them to pick out a series of photos with familiar objects in them, like dogs, cars, stop lights, and more. These tests are harder for bots because of the sheer variety of images they may be shown, although image-recognition software keeps improving.
Timed Forms
Some tests may ask users to fill out a simple form. In some cases, you may not even recognize this as a type of CAPTCHA because the forms can be something as mundane as your shipping address for a product you want to buy, or personal details for a new account you’re creating. This is great for humans, because it doesn’t interrupt your workflow, but less great for bots. Because bots fill out forms almost instantly while humans do not, an unusually fast submission is a red flag, and the website can respond by sending another test or suspending the account until it can be investigated.
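A timing check of this kind might look like the sketch below. The function name and the 3-second minimum are assumptions made for illustration; a real website would pick a threshold that suits the length of its form.

```python
MIN_FILL_SECONDS = 3.0  # assumed threshold, chosen only for this example


def looks_automated(rendered_at, submitted_at, min_seconds=MIN_FILL_SECONDS):
    """Return True when a form came back faster than a person could plausibly fill it in.

    Both arguments are timestamps in seconds, recorded by the server.
    A negative elapsed time (a tampered timestamp) is flagged as well.
    """
    elapsed = submitted_at - rendered_at
    return elapsed < min_seconds
```

For instance, `looks_automated(100.0, 100.2)` flags a form submitted 0.2 seconds after it loaded, while `looks_automated(100.0, 145.0)` lets a 45-second submission through.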
Social Sign Ins
Many websites have started allowing users to sign up with existing Facebook, Google, or Instagram accounts. While most users think this is just a matter of convenience (or, perhaps, surveillance), this method of account creation also doubles as a CAPTCHA. Because bots probably don’t have social media accounts, they have to sign up the old way, by inputting their information manually. Of course, regular humans may choose to do this too (many people don’t want their public accounts linked to other websites, or avoid social media for privacy reasons), and may get hit with another CAPTCHA test afterwards, but this only delays access to your new account.
Honeypots
Some websites don’t actually need you to prove that you’re human anymore. Instead, they simply fool bots into revealing themselves by hiding CAPTCHAs on their sites. The test won’t be visible to a human, but a bot (which reads the code of a website) will see this test and attempt to complete it, which sends up a red flag. Unfortunately, not all bots are fooled by this test anymore, so it’s not the most effective way to protect against them.
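One common way to build this kind of trap is a form field that is hidden from view, so a real visitor leaves it empty while a bot reading the page’s code fills it in. The server-side check could be sketched as follows; the field name `website` and the function name are hypothetical, chosen only for the example.

```python
HONEYPOT_FIELD = "website"  # hypothetical name of the hidden form field


def is_probably_bot(form_data, honeypot_field=HONEYPOT_FIELD):
    """A real visitor never sees the hidden field, so any value in it is a red flag."""
    value = form_data.get(honeypot_field, "")
    return bool(value.strip())
```

A submission such as `{"email": "a@example.com", "website": ""}` passes, while one with `"website": "http://spam.example"` is flagged.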
reCAPTCHA v2 and reCAPTCHA v3
Google, which has owned the reCAPTCHA service since 2009, offers two widely used versions of it. In addition to being secure, both reduce the amount of involvement (and thus hassle) needed from real humans.
The first option, reCAPTCHA v2, is the one almost everyone knows, where a website simply asks you to check a box confirming you are not, in fact, a robot. The thing that people don’t know is why these types of tests are successful; surely any bot can click a check box, right? The answer is yes, they can. However, Google doesn’t care that they can check a box. Instead, it cares how. Google doesn’t publish exactly what it checks, but cursor movement is widely reported to be one of the signals. Humans tend to move a cursor unpredictably. We simply can’t make a perfectly straight line with a mouse, no matter how hard we try. A simple bot, meanwhile, tends to have a highly predictable cursor pattern, which singles it out as a bot.
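To make the cursor idea concrete, here is a toy check that measures how straight a mouse path is. It is emphatically not Google’s method, which is not public; the function names and the 0.999 cut-off are assumptions made purely for illustration.

```python
import math


def path_straightness(points):
    """Ratio of straight-line distance to distance actually travelled.

    `points` is a list of (x, y) cursor positions. A result of 1.0 means the
    cursor moved in a perfectly straight line; wobblier paths score lower.
    """
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def looks_scripted(points, threshold=0.999):
    """Flag paths that are suspiciously close to a perfect straight line."""
    return path_straightness(points) >= threshold
```

A path like `[(0, 0), (50, 50), (100, 100)]` is flagged as scripted, while a wobbly one like `[(0, 0), (30, 12), (55, 40), (80, 58), (100, 100)]` is not. A real system would combine many signals rather than rely on a single measurement like this one.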
The other option, reCAPTCHA v3, is one most people don’t even realize exists because it requires no action from the user whatsoever. Instead, it runs in the background and gives the website a score for how likely each visit is to come from a human, leaving the site to decide what to do with suspicious traffic. One signal it reportedly looks for is a Google cookie on your device (typically the one that prevents you from having to sign in to your Google account again every time you open a new tab). Generally, bots don’t have Google accounts, so if this cookie is missing, it raises a red flag.
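On the website’s side, Google’s documentation describes the v3 result as a score between 0.0 (very likely a bot) and 1.0 (very likely a human), returned alongside a `success` flag and the `action` name the site assigned to the page. What to do with that score is left to the site. Below is a minimal sketch of that decision, assuming the verification response has already been fetched and parsed into a dictionary. The `decide` function, the 0.5 threshold, and the "allow" and "challenge" labels are illustrative choices, not part of Google’s API.

```python
def decide(verification, expected_action, threshold=0.5):
    """Turn a parsed verification response into 'allow' or 'challenge'.

    `verification` is a dict with the keys `success`, `score`, and `action`.
    The threshold is an assumed starting point that a site would tune.
    """
    if not verification.get("success", False):
        return "challenge"  # the token was invalid or expired
    if verification.get("action") != expected_action:
        return "challenge"  # the token was issued for a different page
    if verification.get("score", 0.0) < threshold:
        return "challenge"  # traffic looks automated
    return "allow"
```

For example, `decide({"success": True, "score": 0.9, "action": "login"}, "login")` returns "allow", while the same response with a score of 0.2 returns "challenge", at which point the site might fall back to a visible test.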
CAPTCHA Threats
So far, CAPTCHAs have been a reasonably reliable, if imperfect, method of detecting and blocking bots online. But, while stopping some online threats, they inadvertently create others.
First, as CAPTCHAs progress, they’re inadvertently marching us towards their own obsolescence. In addition to discovering bots, these tests are also often used to train AI. When a human completes a CAPTCHA, that data can be fed to artificial intelligence software as training material, teaching it to recognize the same things humans do. Eventually, it’s possible that AI will learn so much from this data that it can reliably solve any CAPTCHA, making them ineffective.
Second, especially in the case of Google CAPTCHAs, they may have some unexpected effects on your privacy. Whether they’re analyzing your cursor behaviour or scanning your device for cookies, the information that can be gleaned about you from these activities can actually be used to recognize you online wherever you go. This news is hardly surprising, since it’s well known that Google wants user data. However, it still comes as another blow to online privacy.
Unfortunately, little can be done about these problems until CAPTCHAs are replaced with something better.
Love them or hate them, until new methods for banishing bots come to light, CAPTCHAs are here to stay.