unCAPTCHA: using Google services to bypass Google reCAPTCHA
unCAPTCHA is an automated system developed by researchers from the University of Maryland that can bypass Google's reCAPTCHA with an accuracy of 85%. It works by recognizing the audio version of the challenge, which is offered for people with disabilities.
The method exploits a weakness in the audio version of reCAPTCHA: the challenge reads out a numeric code, which must then be entered into the verification field. The algorithm uses several services to determine the numbers, including Google's own Cloud Speech Recognition service.
The researchers have posted the code for their project on GitHub. unCAPTCHA uses speech recognition tools such as Bing Speech Recognition, IBM, Google Cloud, Google Speech Recognition, Sphinx and Wit-AI.
Principle of operation
The audio challenge is a series of numbers of varying length, spoken at different speeds, with different accents, and over background noise. To attack this CAPTCHA, the spoken sounds are detected and the recording is automatically split into segments.
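The splitting step can be illustrated with a minimal sketch. It is not the project's actual code: the function name split_on_silence, the amplitude threshold and the gap length are assumptions chosen for illustration, and the input is a plain list of amplitudes rather than a real audio file.

```python
def split_on_silence(samples, threshold=0.1, min_gap=3):
    """Return (start, end) index pairs for runs of sound separated by silence.

    samples:   sequence of amplitudes in the range [-1.0, 1.0]
    threshold: absolute amplitude below which a sample counts as silent
    min_gap:   number of consecutive silent samples that ends a segment
    (All three parameters are illustrative assumptions.)
    """
    segments = []
    start = None       # index where the current segment began, or None
    last_loud = None   # index of the most recent loud sample
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_gap:
            # Enough silence has passed: close the current segment.
            segments.append((start, last_loud + 1))
            start = None
    if start is not None:
        # The recording ended while a segment was still open.
        segments.append((start, last_loud + 1))
    return segments


# Two synthetic bursts of sound separated by a long quiet gap.
signal = [0.0, 0.0, 0.5, 0.6, 0.0, 0.4, 0.0, 0.0, 0.0, 0.0, 0.7, 0.8, 0.0]
print(split_on_silence(signal))  # [(2, 6), (10, 12)]
```

Short pauses inside a spoken number (the single quiet sample at index 4) do not split the segment; only a gap of at least min_gap silent samples does.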
The audio segment for each number is uploaded to six different free audio transcription services (IBM, Google Cloud, Google Speech Recognition, Sphinx, Wit-AI, Bing Speech Recognition), and their results are aggregated. From the combined results, the most likely string is chosen heuristically. The numbers are then typed into the CAPTCHA field one after another. In testing, the system reached an accuracy of 92% for individual numbers and up to 85% for complete audio challenges.
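The aggregation step can be sketched as follows. The article does not describe the authors' heuristic in detail, so this example substitutes a plain majority vote; the function names vote_digit and combine, and the sample transcriptions, are hypothetical.

```python
from collections import Counter


def vote_digit(candidates):
    """Pick the most common digit among the engines' answers for one segment.

    candidates: list of strings, one per engine. Anything that is not a
    single ASCII digit (empty result, a word, noise) is ignored.
    Returns the winning digit, or None if no engine produced a digit.
    """
    digits = [c for c in candidates if len(c) == 1 and c in "0123456789"]
    if not digits:
        return None
    # On a tie, Counter keeps the digit that was seen first.
    return Counter(digits).most_common(1)[0][0]


def combine(segment_results):
    """Join the per-segment winners into one code; unknowns become '?'."""
    return "".join(vote_digit(r) or "?" for r in segment_results)


# Hypothetical outputs of six engines for three audio segments.
results = [
    ["7", "7", "seven", "1", "7", ""],
    ["3", "3", "3", "8", "", "3"],
    ["", "oh", "", "", "", ""],
]
print(combine(results))  # 73?
```

A real attack would also need to map words such as "seven" to digits before voting; this sketch simply discards them.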
unCAPTCHA is not the first system of its kind. In March of this year, reports appeared of an attack using ReBreakCaptcha, a system almost identical to unCAPTCHA.
Video demonstration
In tests on 450 reCAPTCHA challenges, unCAPTCHA achieved 85.15% accuracy, taking about 5.42 seconds per challenge. That is less time than a person needs just to listen to one reCAPTCHA audio file.
unCAPTCHA
The project code is written in Python using the popular Selenium library and FFmpeg, a set of open source libraries for recording, converting and transmitting digital audio.
→ The source code is published on GitHub.
The research by the creators of the utility is also publicly available.
The developers notified Google of their research, and as a result new measures were added to protect against such attacks.
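As an illustration of the role FFmpeg can play in such a pipeline, here is a minimal sketch that converts a downloaded audio challenge into mono WAV before transcription. The file names, the 16 kHz sample rate and the helper names are assumptions, not details taken from the project.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build an ffmpeg command line that converts src to mono audio at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input file
        "-ac", "1",               # one audio channel (mono)
        "-ar", str(sample_rate),  # output sample rate in Hz
        dst,
    ]


def convert(src, dst):
    """Run the conversion; requires the ffmpeg binary to be installed."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


# Hypothetical file names; only the command is printed here.
print(" ".join(build_ffmpeg_command("challenge.mp3", "challenge.wav")))
```

Building the command separately from running it makes it possible to inspect or log the command without invoking FFmpeg.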