PHDays 9: AI CTF Write-up
Machine learning security has attracted a lot of hype lately, and I wanted to touch on its practical side. PHDays was a great occasion for that: experts from all over the information security world gather there, which gives an opportunity to draw attention to the topic.
So we ran a task-based CTF whose tasks covered some of the security risks of using machine learning techniques.
What is CTF?
Capture The Flag (CTF) is a very popular kind of computer security competition (about as popular as Kaggle competitions are among data scientists). There are two formats: task-based (jeopardy) and service-based (attack-defense). We ran a task-based one.
Classic task-based competitions resemble the TV quiz show Jeopardy!: there is a set of tasks in different categories, each worth a different number of points.
The traditional categories in CTF are: web - web vulnerabilities, reverse - reverse engineering, crypto - cryptography, stegano - steganography, pwn - binary exploitation.
Teams (from 1 to n people) solve tasks, and whoever earns the most points wins.
Our competition lasted a little more than a day. It was individual: teams of one person. I wanted people to take part in the conference and get to know each other in person. Therefore, the tasks had to be solvable in a couple of hours and not require a lot of computing resources, but there had to be difficult tasks too: not everyone should win :D
As a result, we had 6 tasks (the seventh was just for fun), which seems like enough for one person for a day. The tasks themselves, unfortunately, are no longer available. But maybe after reading this write-up you will want to participate next time?
I would like to express my deep gratitude to the guys without whom this CTF would not have taken place: @groke and @mostobriv. The coolest ideas, technical solutions, and a deployment party the night before the start: what could be better when it happens in such terrific company?! :)
Stegano: Aww - 100
tiny.cc/6fj06y
Given: a dataset of 3,391 pictures of cats and dogs.
The task was marked as “Stegano”. Steganography tasks involve hiding some information. It was fairly easy to guess that cats and dogs stand for something binary. After a little thought, we can assume that this sequence of cats and dogs encodes a binary message. Suppose that cats are 1 and dogs are 0. If that does not work out, you can simply swap them. Next, we find a trained model that classifies cats and dogs. There are many tutorials on cat-vs-dog classification, along with the trained models they produce; you can find such models on GitHub. We take a trained model or, as a last resort, train one ourselves. We predict each image as 0 or 1. This sequence of bits is then grouped into bytes and translated into a string.
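The final decoding step can be sketched as follows. This is a minimal illustration, not the author's solution: the function name `bits_to_text`, the most-significant-bit-first packing, and the choice to drop leftover bits that do not fill a whole byte are all assumptions. The classifier itself is omitted; `predictions` stands for its per-image output in file order.

```python
def bits_to_text(bits, cat_is_one=True):
    """Pack 0/1 predictions into bytes (MSB first, an assumption) and decode them."""
    if not cat_is_one:
        # Swap the mapping if the first guess yields garbage.
        bits = [1 - b for b in bits]
    usable = len(bits) - len(bits) % 8  # assumption: ignore an incomplete last byte
    data = bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, usable, 8)
    )
    return data.decode("ascii", errors="replace")


# Hypothetical classifier output for 16 images (1 = cat, 0 = dog).
predictions = [0, 1, 0, 0, 0, 0, 0, 1,   # 0x41 -> 'A'
               0, 1, 0, 0, 1, 0, 0, 1]   # 0x49 -> 'I'
print(bits_to_text(predictions))  # prints "AI"
```

A few misclassified images only corrupt individual characters, so the message usually stays readable even with an imperfect model.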
You can see the author’s solution here
We get text that contains the flag `AICTF{533m5_y0u_und3r574nd_4n1m4l5}`.
However, for some reason, several participants tried at different times to submit a strange flag containing the word “Adopted”. We don’t know where they got it from; if any of the participants can explain, that would be cool :D
Notes
The service was a kind of “blog” where each user could leave public and private entries. Since there was very little functionality, it was not hard to guess that you needed to somehow read a private entry.
There was actually only one input field: the record id.
What to do?
The first thing that comes to a security specialist's mind is to try SQL injection. However, the service was said to be protected by AI, and a simple SQL injection did not get through: the service responded to such an attack with “Hacking attempt!” Many tried to submit that as the flag, but did they really think it would be that simple?
Under the hood, the check was an LSTM network that analyzed the id for SQL injection. However, the input to the LSTM had to be of fixed length. For simplicity, we limited it to 20 characters. That is, the logic was this: we take the request; if it is longer than 20 characters, we truncate it and check what remains; if it is shorter, we pad it with 0.
That is why a simple SQL injection did not work right away.
However, there was a chance to find a vector that the network would not see and would take for a benign request.
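The fixed-length preprocessing described above can be sketched like this. It is only an illustration under stated assumptions: the text does not say which part of a long request was kept or how characters were encoded, so keeping the first 20 characters, padding at the end, and using `ord()` codes are all guesses, and `to_fixed_length` is a hypothetical name.

```python
MAX_LEN = 20  # fixed input length mentioned in the text


def to_fixed_length(record_id, max_len=MAX_LEN):
    """Encode a string as exactly max_len integers: truncate, then zero-pad."""
    codes = [ord(ch) for ch in record_id[:max_len]]  # assumption: keep the head
    return codes + [0] * (max_len - len(codes))      # assumption: pad at the end


short = to_fixed_length("42")
print(len(short), short[:4])  # prints "20 [52, 50, 0, 0]"

# Two different strings that share their first 20 characters
# look identical to a classifier that only sees this window.
a = to_fixed_length("7" * 20 + " first tail")
b = to_fixed_length("7" * 20 + " a completely different tail")
print(a == b)  # prints "True"
```

Under these assumptions, whatever lies outside the 20-character window never reaches the network, which is one way a request could be taken for a benign one.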
New Edge QR reader
The task was to recognize a QR code:
The task files are available here.
Several encrypted files were given. Among them was a .pyc file. Reversing it revealed a function whose code showed that all the necessary files were AES-encrypted with a key derived from the bytecode of that function and of another function nested inside it.
There were two possible solutions: parse the .pyc file and recover the implementation of the functions, or write your own proxy hashlib module that prints its argument and run the program with it. Either way you obtained the key, then decrypted the files and ran the QR reader, which recognized the given picture as the flag.
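To make the idea of a key derived from function bytecode more concrete, here is a rough sketch. It is not the task's actual code: the hash algorithm (SHA-256), the order in which the two bytecodes are combined, and the names `outer`, `inner`, and `key_from_bytecode` are all assumptions chosen for illustration.

```python
import hashlib
import types


def outer():
    def inner():
        return "nested"
    return inner


def key_from_bytecode(func):
    """Hash the raw bytecode of func and of any code objects nested in it."""
    digest = hashlib.sha256()  # assumption: the real hash function is not stated
    digest.update(func.__code__.co_code)
    for const in func.__code__.co_consts:
        # Nested functions are stored as code objects among the constants.
        if isinstance(const, types.CodeType):
            digest.update(const.co_code)
    return digest.digest()


key = key_from_bytecode(outer)
print(len(key))  # prints "32", a length usable as an AES-256 key
```

Since `co_code` differs between interpreter versions, a key built this way is only reproducible on the Python version that produced the bytecode, which is one reason to read it from the .pyc file or to intercept the hashlib call rather than recompile the source.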
A detailed solution by the participant who took 3rd place can be found here:
Prediction challenge
The service was a Kaggle-like competition. You could register, download data, and upload models; the models were tested on private data and the result was recorded on a scoreboard.
And the goal seems obvious: reach 1.0 accuracy.
Was it difficult? Impossible :D
The data was randomly generated, and the implication, of course, was that such accuracy had to be reached some other way. The service accepted models in .pickle format. Everyone seems to already know (though, as it turned out, not everyone) that you can get RCE through pickle, and what could be worse than that?
Nikita's solution (konodyuk)
Actually, that is exactly what had to be done! With remote access to the server, you could download the data on which solutions were tested, fit the model to it, and get 1.0 accuracy and, with it, the flag.
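Why loading an untrusted pickle leads to code execution can be shown with a harmless sketch. The class name `FakeModel` is made up, and the payload only calls `os.getcwd`; it is not the exploit used in the contest, just a demonstration of the mechanism: `pickle` calls whatever callable `__reduce__` returns while loading.

```python
import os
import pickle


class FakeModel:
    """Stands in for an uploaded 'model'; the name is hypothetical."""

    def __reduce__(self):
        # pickle stores "call os.getcwd()" instead of the object's state.
        # Any importable callable with any arguments could be placed here.
        return (os.getcwd, ())


payload = pickle.dumps(FakeModel())

# The server-side equivalent of loading an uploaded model file:
result = pickle.loads(payload)

# No FakeModel comes back: the function was executed during loading.
print(type(result).__name__, result == os.getcwd())  # prints "str True"
```

This is why the `pickle` documentation warns against unpickling data from untrusted sources.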
Photogram
As the name suggests, the service does something with images.
The app's stunning interface invited you to upload a photo.
In response, you received an image with a changed style and the competition logo.
Where is the flag here?
Well-known vulnerabilities are quite common in CTFs, and this time it was ImageTragick. However, few guessed it, and not everyone who tried managed to exploit it.
New age antivirus
This task turned out to be the cherry on top and remained unsolved, although after talking with the participants it turned out that they were very close to the answer.
The task files can be viewed here.
The system receives Python bytecode and executes it on its side. But, of course, it won’t just run anything, because there is an “AI”. It checks the Python version and does not let the "wrong" one through. If the code passed the check, it was run on the server, which means you could obtain a lot of information.
The bytecode produced by the interpreter could be diluted with extra bits so that the neural network performing the check (also an LSTM) would miss it, or you could append a bunch of garbage at the end.
Then, once you could execute Python code, you could find the `flag_reader` binary on the server, which ran as root. The binary had a format string vulnerability through which the flag could be read.
Nikita’s solution (konodyuk) can also be found here.
Summary
By the end of the competition, 130 people had registered, 14 had submitted at least one flag, and 5 of the 6 tasks had been solved, which means we managed to balance hard and easy tasks.
Considering that we did not advertise the contest much, since it was our first time and we would not have been ready for a heavy load, we still consider the competition a great success.
Prize winners:
- 1st place - silent
- 2nd place - kurmur
- 3rd place - konodyuk
The winners were awarded at the end of the second day of PHDays with honors and cool prizes: an AWS DeepLens, a Coral Dev Board, and a backpack with the conference logo.
The guys who usually play classic CTFs and are now getting into machine learning appreciated our contest, so we hope that next time we will be joined by data scientists who are interested in security.