Ear, ear: Hacker could defeat Google reCAPTCHA with speech recognition

News by Rene Millman

Google's reCAPTCHA anti-robot widget has been found to be susceptible to a robot attack that leverages its own online services.

A security researcher has managed to defeat one of Google's security measures by using another one of its services.

According to a security researcher who goes by the moniker “East-EE”, there is a “logic vulnerability” within Google's reCAPTCHA field that enabled them to bypass the security using the company's own speech recognition service.

This is not the first time that reCAPTCHA has been defeated by security researchers. In April 2016, researchers managed to defeat the image reCAPTCHA 70 percent of the time using deep-learning technology.

A proof-of-concept script for the hack, written in Python, has been posted on GitHub. It enables an attacker to automatically bypass reCAPTCHA fields used to protect websites from spam and bot traffic.

The researcher said in a blog post that the attack, dubbed ReBreakCaptcha, works in three stages.

The first step is to get reCAPTCHA to present the hacker with an audio challenge. reCAPTCHA presents a user with one of three possible challenges: an image, text or audio clip. If the hacker is presented with an image or text challenge, clicking on the headphone icon or choosing the ‘Reload Challenge’ button will make reCAPTCHA generate an audio challenge, which “can be easily bypassed”, they said.

The second stage is recognition: a hacker would then have to convert the audio from reCAPTCHA to a WAV file and send this to Google's Speech Recognition API.

“There is a great Python library named ‘SpeechRecognition’ for performing speech recognition, with support for several engines and APIs, online and offline,” said East-EE. “We will use this library implementation of Google Speech Recognition API.

“We will send the ‘wav’ audio file and the Speech Recognition will send us back the result in a string (e.g. ‘25143’). This result will be the solution to our audio challenge.”

Lastly, for the hack to be successful, the speech recognition result would need to be verified to bypass reCAPTCHA. This involves copying and pasting the output string from Stage 2 into the textbox, and clicking ‘Verify’ on the reCAPTCHA widget.

“That's right, we now semi-automatically used Google's Services to bypass another service of its own,” they said.

They added that a lot of people encounter a harder version of the audio challenge. “Therefore, I have committed a workaround to the GitHub Repo that should overcome this situation, though at a lower success rate compared to the original easier audio challenges,” said East-EE.

“It is still not fully clear how this harder version is triggered, but the number one reason suspected is when your IP is suspicious to Google.”

The researcher did not state whether Google has been alerted to the issue. SC Media UK asked Google for a comment, but the company had not replied at the time of writing.

Bogdan Botezatu, senior e-threat analyst at Bitdefender, told SC that while image-based CAPTCHA was problematic for hackers, this new technique of piping audio messages to a speech recognition system can easily be used to solve the CAPTCHA challenge and defeat the security mechanism.

“Most CAPTCHA security mechanisms have an accessibility method that allows visually-impaired computer users to solve the challenges. It is a matter of time until the next popular CAPTCHA system will be defeated by exploiting the accessibility features,” he said.

David Kennerley, director of threat research at Webroot, told SC that this was quite a simple, innovative way to bypass reCAPTCHA.

“I'm sure the irony of using Google to defeat Google will not be lost on people. While success rates may vary, for the limited amount of effort and time needed to pull off this hack, I'd imagine there's still pay-off even if it doesn't work every time,” he said.

Lee Munson, security researcher at Comparitech.com, told SC that ditching reCAPTCHA altogether “may be a drastic move, given the alternatives are arguably no better, but businesses should keep an eye on server logs and be ready to block individual or ranges of IP addresses or put affected pages into a lockdown state”.
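
As a rough illustration of the log monitoring Munson describes, the Python sketch below counts requests to a CAPTCHA-protected page per network range and lists the ranges that cross a threshold. It is a minimal sketch and not anything Munson or Google has published: the log format, the sample lines, the `/signup` path, the threshold and the function names `count_by_network` and `networks_to_block` are all assumptions made for the example.

```python
from collections import Counter
import ipaddress

# Hypothetical log format (an assumption for this sketch, not a real server's):
#   "<client-ip> <request-path> <captcha-result>"
SAMPLE_LOG = [
    "203.0.113.7 /signup pass",
    "203.0.113.7 /signup pass",
    "203.0.113.9 /signup pass",
    "198.51.100.4 /signup fail",
    "198.51.100.4 /about pass",
    "garbage line",
]


def count_by_network(lines, path, prefix=24):
    """Count requests to `path`, grouped by the client's network range."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip malformed lines and requests to other pages.
        if len(parts) != 3 or parts[1] != path:
            continue
        try:
            # strict=False masks the host bits, e.g. 203.0.113.7/24 -> 203.0.113.0/24
            network = ipaddress.ip_network(f"{parts[0]}/{prefix}", strict=False)
        except ValueError:
            continue  # first field was not a valid IP address
        counts[str(network)] += 1
    return counts


def networks_to_block(lines, path="/signup", prefix=24, threshold=3):
    """Return the ranges with at least `threshold` requests to `path`."""
    counts = count_by_network(lines, path, prefix)
    return sorted(net for net, hits in counts.items() if hits >= threshold)


if __name__ == "__main__":
    print(networks_to_block(SAMPLE_LOG))
```

Setting `prefix=32` in this sketch flags individual addresses rather than ranges. A real deployment would need a time window and a threshold tuned to the site's normal traffic, neither of which is modelled here.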
