The zoo of AFL fuzzers
Articles on American Fuzzy Lop (AFL) have already appeared on Habr a couple of times (1, 2). This article, however, focuses not on classic AFL but on the auxiliary utilities built around it and on its modifications, which in our opinion can significantly improve the quality of fuzzing. If you want to learn how to boost AFL and find more vulnerabilities faster, read on!
What is AFL and why is it so good
AFL is a coverage-guided (or feedback-based) fuzzer. You can learn more about these concepts from the excellent paper Fuzzing: Art, Science, and Engineering. To summarize what AFL does:
- Instruments the executable to collect coverage information.
- Mutates the input data to maximize coverage.
- Repeats the previous step to find program crashes.
- Has proven very effective in practice.
- Is very easy to use.
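The loop summarized in the list above can be sketched in a few lines of Python. This is a toy illustration, not AFL's actual implementation: the `target` function, its branch labels, and the single-byte mutation are all hypothetical stand-ins, and the seed is assumed to be non-empty.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical 'instrumented' target: returns the branch labels it hit."""
    hit = {"entry"}
    if data[:1] == b"F":
        hit.add("F")
        if data[1:2] == b"U":
            hit.add("FU")
            if data[2:3] == b"Z":
                raise RuntimeError("simulated crash")
    return hit

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte (real fuzzers use many more strategies)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed: bytes, iterations: int, rng_seed: int = 0):
    """Keep every input that reaches new coverage and mutate it further."""
    rng = random.Random(rng_seed)
    queue = [seed]
    seen = set(target(seed))
    crashes = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(queue), rng)
        try:
            coverage = target(candidate)
        except RuntimeError:
            crashes.append(candidate)   # a crash is a finding, not a queue entry
            continue
        if not coverage <= seen:        # new branch reached: keep this input
            seen |= coverage
            queue.append(candidate)
    return queue, crashes
```

Without the coverage feedback, hitting the three-byte prefix by chance would take on the order of 256^3 attempts; with it, each correct byte is kept in the queue and built upon.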
Graphically, this can be represented as follows:
If you do not know what AFL is, we recommend starting with:
- Official project page
- afl-training - a brief introduction to AFL
- afl-demo - a simple demonstration of how to fuzz a C++ program using AFL
- afl-cve - a collection of vulnerabilities discovered using AFL (not updated since 2017)
- What AFL adds to a program when building it is described here
- Some useful tips for fuzzing network applications can be found here.
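As an illustration of what the build-time instrumentation mentioned in the list above records, here is a minimal Python sketch of the edge-counting scheme described in AFL's technical notes (`shared_mem[cur_location ^ prev_location]++; prev_location = cur_location >> 1`). The class and function names are ours, and the block IDs in the usage example are arbitrary; in AFL they are random values chosen at compile time.

```python
MAP_SIZE = 1 << 16  # AFL's default coverage map is 64 KiB

class EdgeCoverage:
    """Toy model of the shared-memory bitmap updated at every basic block."""

    def __init__(self):
        self.bitmap = bytearray(MAP_SIZE)
        self.prev_location = 0

    def visit(self, cur_location: int) -> None:
        # XOR of the current and previous block IDs identifies the edge taken.
        index = (cur_location ^ self.prev_location) % MAP_SIZE
        self.bitmap[index] = (self.bitmap[index] + 1) & 0xFF  # 8-bit hit counter
        # The shift makes A->B differ from B->A and keeps A->A from mapping to 0.
        self.prev_location = cur_location >> 1

def trace(block_ids):
    """Replay a sequence of block IDs and return the resulting bitmap."""
    cov = EdgeCoverage()
    for block_id in block_ids:
        cov.visit(block_id)
    return cov.bitmap
```

For example, `trace([0x1234, 0x5678])` and `trace([0x5678, 0x1234])` produce different bitmaps, so the fuzzer can tell the two execution orders apart.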
At the time of writing, the latest version of AFL was 2.52b. The fuzzer is under active development, and over time some third-party developments get merged into the main AFL branch and lose their relevance as separate projects. Currently, several useful auxiliary tools stand out; they are listed in the sections below.
Rode0day competition
The monthly Rode0day competition deserves a separate mention. In it, fuzzers compete to find the most vulnerabilities the fastest in pre-prepared corpora, both with and without access to source code. By and large, it is a contest between various modifications and forks of AFL.
However, some AFL users note that the fuzzer's author, Michal Zalewski, appears to have given up supporting his creation, since the latest changes date back to November 5, 2017. This is supposedly connected with his departure from Google and work on new projects. As a result, users have started collecting and writing patches for the latest version, 2.52b, on their own.
There are also various variants and derivatives of AFL that allow fuzzing Python, Go, Rust, OCaml, GCJ Java, kernel syscalls, or even entire VMs.
AFL for other programming languages
- python-afl - for Python.
- afl.rs - for fuzzing programs written in Rust.
- afl-fuzz-js - afl-fuzz for JavaScript.
- java-afl - AFL fuzzing for Java.
- kelinci - another fuzzer for Java, with an accompanying article.
- javan-warty-pig - an AFL-like fuzzer for the JVM.
- afl-swift - for fuzzing programs written in Swift.
- ocamlopt-afl - for OCaml.
- sharpfuzz - an AFL-based fuzzer for .NET.
Auxiliary tools
In this section, we have collected various scripts and tools for working with AFL and divided them into several categories:
Crash processing
- afl-utils is a set of utilities for automatic processing and analysis of crashes and for minimizing test cases.
- afl-crash-analyzer - another crash analyzer for AFL.
- fuzzer-utils - a set of scripts for analyzing the results.
- atriage is a simple triage tool.
- afl-kit - a reimplementation of afl-cmin in Python.
- AFLize is a tool that automatically generates builds of Debian packages suitable for afl.
- afl-fid - a set of tools for working with input data.
Working with code coverage
- afl-cov - provides human-readable coverage data.
- count-afl-calls - ratio estimation. The script counts the number of instrumented blocks in a binary.
- afl-sancov is like afl-cov, but uses a clang sanitizer.
- covnavi is a script for code coverage and analysis from the Cisco Talos Group.
- LAF LLVM Passes - a collection of patches for afl that modify the code so that it is easier for the fuzzer to get through branches.
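To see why code transformations such as the LAF LLVM passes in the list above help a fuzzer, consider a comparison against a 32-bit magic value. The sketch below is a hypothetical Python model, not the passes themselves: it contrasts one monolithic comparison with the same check split into byte-sized comparisons, where every matching byte produces a new piece of coverage. The constant `MAGIC` and the branch labels are made up for illustration.

```python
MAGIC = 0xDEADBEEF  # hypothetical magic value checked by the target

def check_monolithic(value: int) -> set:
    """One 32-bit comparison: coverage changes only when all 4 bytes match."""
    return {"match"} if value == MAGIC else set()

def check_split(value: int) -> set:
    """The same check as four 8-bit comparisons, most significant byte first."""
    hit = set()
    for i, shift in enumerate((24, 16, 8, 0)):
        if (value >> shift) & 0xFF != (MAGIC >> shift) & 0xFF:
            break                # first mismatching byte ends the chain
        hit.add(f"byte{i}")      # each matching byte is a new covered branch
    return hit
```

With the split version, an input such as `0xDEAD0000` already covers two of the four branches, so a coverage-guided fuzzer keeps it and needs to guess only one byte at a time instead of all 32 bits at once.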
Scripts for minimizing test cases
- afl-pytmin is a wrapper for afl-tmin that tries to speed up test case minimization by using multiple CPU cores.
- afl-ddmin-mod is a variation of afl-tmin based on the ddmin algorithm.
- halfempty is a fast, parallelized utility for minimizing test cases from Tavis Ormandy.
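For readers unfamiliar with the ddmin (delta debugging) algorithm mentioned in the list above, here is a simplified Python sketch of the idea. It is not the code of afl-ddmin-mod; the predicate `still_fails` is a hypothetical callback that in a real setup would run the target on the candidate input and report whether it still crashes.

```python
def ddmin(data: bytes, still_fails) -> bytes:
    """Shrink `data` while `still_fails(data)` keeps returning True."""
    assert still_fails(data), "the starting input must reproduce the failure"
    n = 2  # current granularity: number of chunks to split the input into
    while len(data) >= 2:
        chunk = max(1, len(data) // n)
        subsets = [data[i:i + chunk] for i in range(0, len(data), chunk)]
        reduced = False
        # 1. Try each chunk on its own.
        for subset in subsets:
            if still_fails(subset):
                data, n, reduced = subset, 2, True
                break
        # 2. Otherwise try removing one chunk at a time.
        if not reduced:
            for i in range(len(subsets)):
                complement = b"".join(subsets[:i] + subsets[i + 1:])
                if still_fails(complement):
                    data, n, reduced = complement, max(n - 1, 2), True
                    break
        # 3. Otherwise split more finely, or stop at single-byte granularity.
        if not reduced:
            if n >= len(data):
                break
            n = min(len(data), n * 2)
    return data
```

For instance, with a predicate that "fails" whenever the input contains both `<` and `>`, the input `b"aa<bb>cc"` is reduced to `b"<>"`.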
Distributed fuzzing
- disfuzz-afl - distributed fuzzing for afl.
- AFLDFF is a framework for distributed fuzzing with AFL.
- afl-launch is a tool for launching multiple AFL instances.
- afl-mothership - manages and launches multiple synchronized AFL fuzzers in the AWS cloud.
- afl-in-the-cloud is another script for running afl in AWS.
- VU_BSc_project - fuzz testing of open source libraries with libFuzzer and AFL.
A very good article on this topic, “Scaling AFL to a 256 thread machine”, was also published recently; it describes running AFL on 256 threads.
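Most of the launchers above ultimately start several afl-fuzz processes that share one sync directory, using AFL's `-M` (master) and `-S` (secondary) options. The following Python sketch only builds such command lines; the helper name, the directory names, and the `fuzzerNN` naming scheme are assumptions made for illustration, and it does not reproduce how any of the listed tools work internally.

```python
def parallel_commands(target, in_dir, sync_dir, instances):
    """Build afl-fuzz command lines for one master and N-1 secondary instances.

    `target` is the target command as a list, e.g. ["./target", "@@"].
    """
    if instances < 1:
        raise ValueError("at least one instance is required")
    commands = []
    for n in range(instances):
        role = "-M" if n == 0 else "-S"   # the first instance acts as master
        name = f"fuzzer{n + 1:02d}"       # hypothetical naming scheme
        commands.append(
            ["afl-fuzz", "-i", in_dir, "-o", sync_dir, role, name, "--", *target]
        )
    return commands

if __name__ == "__main__":
    for cmd in parallel_commands(["./target", "@@"], "in", "sync", 4):
        print(" ".join(cmd))
```

Each printed line can then be started in its own terminal or tmux pane, or passed to `subprocess.Popen`, typically one instance per CPU core.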
Deployment, management, monitoring, reporting
- afl-other-arch is a set of patches and scripts for simply adding support for various (non-x86) architectures in AFL.
- afl-trivia - several small scripts to simplify AFL management.
- afl-monitor is a script for monitoring AFL operation.
- afl-manager is a python web server for managing multi-afl.
- afl-tools is a docker image with afl-latest, afl-dyninst and Triforce-afl.
- afl-remote is a web server for remote management of afl instances.
AFL modifications
AFL has had a strong influence on the vulnerability research community and on fuzzing itself. Not surprisingly, over time various modifications inspired by the original AFL and based on its ideas began to appear. In this section we look at them. Each of these modifications has its own advantages and disadvantages compared to the original AFL, depending on the situation.
Note up front that if you run into installation problems or do not want to waste time, almost any of these modifications can be found on hub.docker.com.
Why modify AFL?
- To increase speed and/or code coverage
  - Algorithms
  - Environment
    - OS
    - Hardware
- To work without source code
  - Code emulation
  - Code instrumentation
    - Static
    - Dynamic
Built-in AFL modes of operation
Before examining the various modifications and forks of AFL, we need to mention two important modes that were once modifications themselves and eventually became built in: Syzygy mode and Qemu mode.
Syzygy mode is a mode of operation that uses the instrument.exe tool:
instrument.exe --mode=afl --input-image=test.exe --output-image=test.instr.exe
This mode statically rewrites PE32 binaries with AFL instrumentation. It requires symbols, and additional development is needed to make WinAFL kernel-aware.
Qemu mode - how it works under QEMU is described in “Internals of AFL fuzzer - QEMU Instrumentation”. Support for fuzzing binaries with QEMU appeared in upstream AFL in version 1.31b. AFL's qemu mode works by adding binary code instrumentation to QEMU's TCG (Tiny Code Generator) binary translation engine. For this purpose AFL ships a QEMU build script, which downloads the sources of a specific QEMU version (2.10.0), applies several small patches, and builds it for the chosen architecture. The result is the afl-qemu-trace binary, which is in fact QEMU user-mode emulation (that is, emulation of ELF executables only). This makes feedback-based fuzzing possible for ELF binaries across the many architectures QEMU supports. You also get all the usual AFL tooling, from the convenient status screen for the current session to advanced tools such as afl-analyze. Keep in mind, however, that you also inherit QEMU's limitations. For example, if a binary was built with a toolchain that relies on hardware features of the SoC it runs on and QEMU does not support them, fuzzing will stop as soon as a specific instruction is encountered or a specific MMIO region is accessed.
There is also an interesting fork of qemu mode in which speed was increased 3x-4x thanks to TCG code instrumentation and caching.
Forks
AFL forks have appeared primarily because of changes and improvements to the algorithms of classic AFL.
- afl-cygwin is an attempt to port classic AFL to Windows using Cygwin. Unfortunately, this attempt is quite basic and slow, and development appears to have been abandoned.
- AFLFast (extends AFL with Power Schedules) is one of the first AFL forks. It adds heuristics that let it explore more paths in a short period of time.
- FairFuzz is an extension for AFL whose goal is to devote more time to rarer branches.
- AFLGo is an extension for AFL intended primarily for reaching specific parts of the code rather than for general coverage of the program. This can be used to test patches or newly added code.
- PerfFuzz is an extension for AFL that looks for test cases that slow the program down as much as possible.
- Pythia is an extension for AFL that adds prediction to the fuzzing process, estimating how difficult it will be to find new paths.
- Angora is one of the most recently released fuzzers, written in Rust. It uses new mutation strategies to increase coverage.
- Neuzz - an attempt at fuzzing using neural networks.
- UnTracer-AFL - an integration of afl with UnTracer for efficient tracing.
- Qsym - Practical Concolic Execution Engine Tailored for Hybrid Fuzzing. In essence, this is a symbolic execution engine (its main components are implemented as a plugin for Intel Pin) that, combined with afl, implements hybrid fuzzing. It is a further step in the evolution of feedback-based fuzzing and deserves a separate discussion. Its main merit is that it performs concolic execution very quickly compared to other engines. This is achieved through native execution of instructions without an intermediate code representation, by dropping the snapshot mechanism, and through a number of heuristics. It uses an old version of Intel Pin (because of compatibility problems between libz3 and other DBT frameworks) and currently works with ELF binaries for the x86 and x86_64 architectures.
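To give a feel for the power schedules that AFLFast (see the list above) adds to AFL, here is a small Python sketch loosely modeled on the exponential schedule described in the AFLFast paper: a seed receives more energy (more mutations) the more often it has been picked, and less energy the more frequently its path is exercised by generated inputs. The function name, parameter names, and default constants are our assumptions, not values taken from the AFLFast source.

```python
def fast_energy(base_energy: float, times_chosen: int, path_frequency: int,
                beta: float = 2.0, max_energy: float = 10000.0) -> float:
    """Energy (number of mutations) to spend on a seed, capped at max_energy.

    base_energy    - energy classic AFL would have assigned to the seed
    times_chosen   - how many times the seed was already picked from the queue
    path_frequency - how many generated inputs exercised the same path
    """
    if path_frequency < 1:
        raise ValueError("path_frequency must be at least 1")
    exponent = min(times_chosen, 64)  # keep 2**exponent within sane bounds
    energy = base_energy / beta * (2 ** exponent) / path_frequency
    return min(energy, max_energy)
```

With these defaults, a seed on a rarely exercised path (`path_frequency=4`) gets far more energy than one on a very common path (`path_frequency=1000`), which is the intuition behind spending less time on high-frequency paths.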
It is worth mentioning that there are many academic papers that implement new fuzzing approaches and techniques by taking AFL and modifying it. For many of them nothing but the whitepaper is available, so we do not list such implementations here. If you are interested, they are easy to find online. Recent examples include CollAFL: Path Sensitive Fuzzing, EnFuzz, Smart Greybox Fuzzing, and ML for afl.
Modifications based on Qemu
- TriforceAFL - AFL/QEMU fuzzing with full system emulation. A fork from nccgroup. It allows fuzzing an entire operating system in qemu mode. This is implemented through a special instruction (aflCall (0f 24)) added to the QEMU x64 CPU. Unfortunately, it is no longer maintained; the latest AFL version it is based on is 2.06b.
- TriforceLinuxSyscallFuzzer - fuzzing Linux system calls.
- afl-qai is a small demo project with QEMU Augmented Instrumentation (qai).
Modification based on KLEE
kleefl - generates test cases by means of symbolic execution (very slow on large programs).
Modifications based on Unicorn
afl-unicorn - lets you fuzz fragments of code by emulating them on the Unicorn Engine. We have successfully used this variation of AFL in our own practice, namely on sections of code from an RTOS that ran on an SoC, where QEMU mode could not be used. This modification is advisable when no sources are available (you cannot build a stand-alone binary to analyze the parser) and the program does not accept input data directly (for example, the data is encrypted, or consists of signal samples, as in one CGC binary). In that case you can reverse engineer the program and find the candidate functions where this data is processed in a format convenient for the fuzzer. This is the most universal modification of AFL, in the sense that it lets you fuzz literally anything: it does not depend on the architecture, the availability of sources, the format of the input data, or the format of the binary itself (the most striking example is bare metal, where you have just pieces of code from a controller's memory). The researcher first examines the binary and writes a fuzzer that emulates the state at the entry to, for example, the parser procedure. Obviously, unlike with AFL, this requires some preliminary research on the binary. For bare-metal firmware, such as Wi-Fi or baseband, there are a number of drawbacks to keep in mind:
- You have to locate the checksum checks somehow.
- The fuzzer's state is the memory state saved in the memory dump, which may prevent the fuzzer from reaching certain paths.
- Calls to dynamic memory functions are not sanitized; this can be implemented manually (at additional effort), and it depends on the RTOS (which must also be researched beforehand).
- Inter-task interaction in the RTOS is not emulated, which may also prevent the fuzzer from finding certain paths.
Examples of working with this modification are “afl-unicorn: Fuzzing Arbitrary Binary Code” and “afl-unicorn: Part 2 - Fuzzing the 'Unfuzzable'”.
Before moving on to modifications based on dynamic binary instrumentation (DBI) frameworks, recall that DynamoRIO is the fastest of these frameworks, followed by DynInst and finally PIN.
PIN based modifications
- aflpin - AFL with an Intel PIN tool.
- afl_pin_mode - Another AFL tool implemented via an Intel PIN.
- afl-pin - AFL with PINtool.
- NaFl - A clone (of the basic core) of AFL fuzzer.
- PinAFL - the author tried to port AFL to Windows for fuzzing already compiled binaries. It looks like it was done for fun in one evening, and the project has not been developed since. The repository contains no source code, only prebuilt binaries and launch instructions. The AFL version it is based on is not stated, and it supports only 32-bit applications.
As you can see, there are many different modifications, but they are of little use in real life.
Modifications based on Dyninst
afl-dyninst - American Fuzzy Lop + Dyninst == AFL blackbox fuzzing. The key feature of this version is that the program under test (without source code) is first statically instrumented (static binary instrumentation, static binary rewriting) using Dyninst and then fuzzed with classic AFL, which believes the program was built with afl-gcc / afl-g++ / afl-as ;) As a result, we can work without source code and with very good performance: it used to run at 0.25x the speed of a native compile. It also has a significant advantage over QEMU: it can instrument dynamically linked libraries, whereas QEMU can only instrument the main executable, statically linked with its libraries. Unfortunately, it is currently relevant only for Linux. Windows support requires changes in Dyninst itself, and that work is in progress.
It is also worth looking at a fork that has been substantially improved with various features (support for the AARCH64 and PPC architectures) and better speed ;)
Modifications based on DynamoRIO
- drAFL - AFL + DynamoRIO = fuzzing without source on Linux.
- afl-dr - another implementation based on DynamoRIO, which has already been described in great detail on Habr.
- afl-dynamorio - a modification from vanhauser-thc (a fan of improving and stabilizing AFL). He describes this version as follows: “run AFL with DynamoRIO when normal afl-dyninst is crashing.” On the plus side, support for ARM and AARCH64 has been added. As for performance, DynamoRIO is ~10x slower than Qemu and ~25x slower than dyninst, but ~10x faster than Pintool.
- WinAFL - the best-known AFL fork for Windows (based on DynamoRIO; there is also a Syzygy mode). The appearance of this modification was only a matter of time, since many people wanted to try AFL on Windows applications for which no source code is available. The tool is currently under active development, and despite using a slightly outdated AFL code base (2.43b at the time of writing), it has already helped find several vulnerabilities (CVE-2016-7212, CVE-2017-0073, CVE-2017-0190, CVE-2017-11816). Its main developers are specialists from the Google Project Zero team and the MSRC Vulnerabilities and Mitigations Team, which gives reason to hope for further active development. To implement the fuzzer, the developers moved from compile-time instrumentation to dynamic instrumentation (based on DynamoRIO). As expected, this slowed down the execution of the software under test, but the resulting overhead (twofold) is comparable to that of classic AFL in binary mode. The developers also solved the problem of slow process startup with what they call persistent fuzzing mode: you choose the function to fuzz (by offset within the file, or by name if the function is in the export table), and it is instrumented so that it can be called in a loop, thereby running many input samples without restarting the process.
An interesting article also appeared recently in which researchers showed how they found ~50 vulnerabilities in ~50 days using WinAFL. Shortly before that article was published, an Intel PT mode was also added to WinAFL (more on this below); the details are here.
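The benefit of the persistent fuzzing mode described above for WinAFL can be illustrated with a small Python model. It is purely conceptual: `Harness`, `start`, and `parse` are hypothetical stand-ins for a target process with an expensive startup and for the function selected for fuzzing, and `iterations_per_process` plays a role similar to a limit on loop iterations before the process is restarted.

```python
class Harness:
    """Hypothetical target: counts how often the expensive startup runs."""

    def __init__(self):
        self.startups = 0
        self.calls = 0

    def start(self) -> None:
        self.startups += 1        # stands for process creation, DLL loading, init

    def parse(self, data: bytes) -> int:
        self.calls += 1           # stands for the function chosen for fuzzing
        return len(data)

def fuzz_restarting(harness: Harness, inputs) -> None:
    """Classic mode: a fresh process for every input."""
    for data in inputs:
        harness.start()
        harness.parse(data)

def fuzz_persistent(harness: Harness, inputs, iterations_per_process=1000) -> None:
    """Persistent mode: call the target function in a loop, restart rarely."""
    for n, data in enumerate(inputs):
        if n % iterations_per_process == 0:
            harness.start()
        harness.parse(data)
```

For 2500 inputs the restarting variant pays the startup cost 2500 times, while the persistent variant pays it only 3 times. The trade-off is that state leaking between iterations can make results less reproducible, which is one reason to restart the process periodically.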
The advanced reader may notice that there are modifications for all the popular instrumentation frameworks except Frida, and that is indeed the case. The only mention of using Frida with AFL that we found is in Chizpurfle: A Gray-Box Android Fuzzer for Vendor Service Customizations. An AFL version with Frida would be really useful, since Frida has good support for a number of RISC architectures.
Many researchers are also eagerly awaiting the release of the Scorpio DBI framework from the creator of Capstone, Unicorn, and Keystone. The authors have already built a fuzzer (Darko) on top of this framework and, according to them, successfully use it for fuzzing embedded devices. More on this can be found in “Digging Deep: Finding 0days in Embedded Systems with Code Coverage Guided Fuzzing”.
Modifications based on CPU hardware
When it comes to AFL modifications that rely on hardware features of the processor, this primarily means the ability to fuzz kernel code and, secondly, faster fuzzing of applications without source code.
First and foremost, of course, we are talking about the Intel PT (Processor Trace) hardware feature, which is available starting from the 6th generation of processors (approximately since 2015). Naturally, to use the fuzzers below you will need hardware that supports Intel PT.
- WinAFL-IntelPT is a third-party modification of WinAFL that uses Intel PT technology instead of DynamoRIO.
- kAFL is an academic project aimed at solving the problem of coverage-guided kernel fuzzing in an OS-independent manner. This is achieved by using a hypervisor and Intel PT technology. You can learn more from the whitepaper “kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels”.
Conclusion
As you can see, this area is developing actively. At the same time, there is still plenty of room for creativity in building new, interesting, and useful modifications of AFL.
Thank you for your attention, and happy fuzzing!
Coauthor: Nikita Knyzhov
P.S. Thanks to the whole research center team for their help in preparing this material; without their experience and support it would not have been possible.