Use offensive techniques to enrich Threat Intelligence

    Today, Threat Intelligence, or the active collection of information on information security threats, is a tool of prime necessity in the process of identifying information security incidents. Typical TI sources include free subscriptions with malicious indicators, bulletins from manufacturers of equipment and software with descriptions of vulnerabilities, reports of security researchers with detailed descriptions of threats, as well as commercial subscriptions from TI vendors. At the same time, information obtained from the above sources is often not sufficiently complete and relevant. Improving the efficiency and improving the quality of TI can be facilitated by the use of OSINT (open source intelligence) and offensive methods (that is, methods that are characteristic not for the defender, but for the attacker) in information security,


    This article is for informational purposes only. The author is not responsible for the possible harm and / or disruption of the networks and / or services of third parties related to the implementation of certain actions described in the article. The author also encourages readers to comply with the law.

    Not instead, but together

    Just want to make a reservation that it will not be about refusing or replacing Threat Intelligence information obtained from traditional paid and free sources, but about its addition and enrichment.

    Thus, the use of additional TI methods can increase its efficiency and help in solving a number of problems. For example, with another IoT epidemic affecting certain vulnerable versions of firmware, how soon can you get a feed with IP addresses of potentially vulnerable devices in order to quickly detect their activity on the perimeter? Or, if the monitoring system rules are triggered by an indicator (IP address or domain name) that has been flagged malicious a year or more ago, how to determine whether it still remains malicious, provided that it initiate any additional verification of the indicator as "For now" is usually impossible.

    That is, addition and enrichment will be especially useful for intermediate indicators according to the “pyramid of pain” by David Bianco [0] (IP addresses, domain names, network artifacts), however in some circumstances it is possible to obtain new long-lived indicators (up the pyramid) by applying appropriate analytics.

    Internet scanning

    One method useful for obtaining Threat Intelligence is network scanning.

    What is usually collected when scanning
    Чаще всего при проведении сканирования наиболее интересные результаты дает сбор так называемых «баннеров» – ответов прикладного ПО сканируемой системы на запросы сканера. В «баннерах» содержится достаточно много данных, идентифицирующих прикладное ПО и его различные параметры на «той стороне», и делается такая проверка достаточно быстро.

    When solving the problem of expanding the scope of Threat Intelligence, the purpose of the scan will be the entire Internet. When scanning a public address space (~ 3.7 billion IPv4 addresses, excluding reserved addresses), you can extract the following useful information:

    • Which nodes are exposed to vulnerabilities that are widely exploited within the framework of actual malicious campaigns and, in view of this fact, constitute potential sources of harmful effects.
    • Which nodes are the managing servers of the botnet networks [1] , which can be identified by the compromised node in the protected perimeter.
    • Which nodes belong to the non-public part of distributed anonymous networks that can be used for invisible, uncontrolled exit beyond the protected perimeter.
    • More information about the nodes, "lit up" in the response of the monitoring system.

    Tool selection

    Over the years, the development of information networks has created a large number of tools for network scanning. Among the current software are the following scanners:

    Briefly consider the main advantages and disadvantages of these tools for the enrichment of TI in the Internet space.


    Perhaps the most famous network scanner, created 20 years ago. Due to the possibility of using custom scripts (via the Nmap Scripting Engine), this is an extremely flexible tool, not limited to just collecting application banners. At the moment, quite a lot of NSE scripts have been written, many of which are available for free. Since Nmap tracks every connection, in other words, has a synchronous nature, it is not worth using it for large networks such as the Internet because of the low speed of the scanner. At the same time, it is advisable to use this tool on small samples of addresses obtained by faster tools, due to the power of the NSE.

    About Nmap speed
    Несмотря на изначально невысокую производительность, с момента выхода данного сканера его разработчиками было добавлено много функциональных возможностей, повышающих скорость сканирования. В их числе параллельные режимы работы и задание рейта в пакетах в секунду. Однако даже с «выкрученными» настройками Nmap отстает от своих асинхронных конкурентов по скорости. На примере сканирования /15 подсети по 443/tcp-порту с виртуальной машины с 1 CPU Xeon Gold 6140 и гигабитным каналом получаем:

    • 71.08 секунды для Nmap с параметрами -T5 -n -Pn -p 443 --open --min-rate 1000000 --randomize-hosts --defeat-rst-ratelimit –oG nmap.txt
    • 9.15 секунды для Zmap с параметрами -p 443 –o zmap.txt


    Well-known, though not the first of its kind asynchronous network scanner, which appeared in 2013. According to the report of the developers, published at the 22nd USENIX Security Simposium conference [5] , the tool is able to scan the entire public address range running on an average computer with a gigabit Internet connection in less than 45 minutes (one port each). The advantages of Zmap include:

    • High speed; using the PF_RING framework and 10-gigabit connectivity, the theoretical scan time for the public IPv4 address space is 5 minutes (one port each).
    • Support random scans of network addresses.
    • Module support for TCP SYN scans, ICMP, DNS queries, UPnP, BACNET [6] , UDP probes .
    • Presence in the repositories of common distributions of GNU / Linux (in addition, there is support for Mac OS and BSD).
    • In addition to Zmap, a series of related utility projects have been developed that extend the functionality of the scanner.

    If we talk about the disadvantages of Zmap, here we can note the possibility of scanning addresses only on one port as at the moment.

    Among the Zmap-related projects that extend its functionality for tasks of interest, the following can be distinguished:

    • ZGrab is an application protocol scanner with support for TTP, HTTPS, SSH, Telnet, FTP, SMTP, POP3, IMAP, Modbus, BACNET, Siemens S7 and Tridium Fox protocols (advanced banner grabber).
    • ZDNS is a utility for quick DNS queries.
    • ZTag is a utility for tagging scan results issued by ZGrab.
    • ZBrowse is a Chrome headless distribution utility for tracking website content changes.

    About the possibilities of ZGrab
    Продвинутый сканер прикладных протоколов ZGrab, для которого уже вышла версия 2.0 (полного функционального паритета с версией 1.Х пока нет, однако в ближайшее время он будет достигнут, после чего можно будет говорить о полном замещении старой версии новой), позволяет достаточно гибко фильтровать результаты сканирования благодаря поддержке структурированного формата вывода (JSON). Например, для сбора только Distinguished Name Issuer’а и Subject’а из HTTPS-сертификатов достаточно простой команды:

    zmap --target-port=443 --output-fields=saddr | ./zgrab2 tls -o - | jq '.ip + "|" + .data.tls.result.handshake_log.server_certificates.certificate.parsed.issuer_dn + "|" + .data.tls.result.handshake_log.server_certificates.certificate.parsed.subject_dn' > zmap-dn-443.txt


    An asynchronous scanner created in the same year as Zmap, according to the command syntax, is similar to Nmap.

    The creator (Robert David Graham) declared greater productivity than all asynchronous scanners that existed at that time, including Zmap, achieved through the use of a custom TCP / IP stack.

    Advantages of the scanner:

    • High performance with a theoretical performance of up to 10 million packets per second, which allows you to scan all IPv4 public space in a few minutes (one port) on a 10-gigabit connection - the same as in Zmap using the PF_RING framework.
    • TCP, SCTP and UDP protocol support (via sending UDP payloads from Nmap).
    • Support random scans of network addresses.
    • Supports selection of both IP range and port range when scanning.
    • Support for various formats of output scan results (XML, grepable, JSON, binary, as a list).

    At the same time, the scanner has several disadvantages:

    • In connection with the use of a custom TCP / IP stack, it is recommended to run a scan on a dedicated IP to avoid conflicts with the OS stack.
    • Compared with ZGrab, banner grabber’s options are quite limited.

    Scanner Comparison Chart

    The summary table below presents a comparison of the functionality of the scanners considered.

    Values ​​in table rows
    "++" – очень хорошо реализовано
    "+" – реализовано
    "±" – реализовано с ограничениями
    "-" – не реализовано

    Selection of targets and scanning method

    After choosing the tools to use, you need to decide on the object and purpose of the scan, that is, understand what and why to scan. If with the first, as a rule, there are no questions (this is a public IPv4 address space with a number of exceptions), then with the second, everything depends on the desired end result. For example:

    • To identify nodes affected by a particular vulnerability, it is wise to first go over the scanned range with a fast tool (Zmap, Masscan) to identify nodes with open ports of vulnerable services; after finalizing the list of addresses, use Nmap with the appropriate NSE script, if one exists, or create a custom script based on the available information about the vulnerability in security researchers publications. Sometimes simple enough grep'a on the collected application banners, because many services display information about their version in it.
    • To identify the nodes of the hidden infrastructure - both anonymous networks and C2 nodes (that is, the managing servers of the botnet networks) - you will need to create custom payloads / probes.
    • For solving a wide range of tasks on identifying compromised nodes, parts of the infrastructure of anonymous networks and others, quite a lot can be obtained by collecting and parsing certificates during a TLS handshake.

    Selection of infrastructure

    The fact that modern fast scanners can show impressive results on fairly modest resources of an average processor does not mean that the load on the network infrastructure will be insignificant. Due to the specifics of the traffic generated during scanning, a rather significant packet rate per second is obtained. According to experience, one core of an average processor (on a CPU manufactured in recent years, in a virtual environment, under any OS, without any tuning of the TCP / IP stack) is able to generate a stream of about 100-150 thousand packets per second per ~ 50 Mbps band that is a serious load for software routers. Hardware networking equipment may also have problems reaching the ASIC performance limit. At the same time, with a scanning speed of 100-150 Kpps (thousands of packets per second), the crawling of the public IPv4 range may take more than 10 hours, i.e. quite a significant amount of time. For a really fast scan, you should use a distributed network of scanning nodes that divide the scanned pool into parts. It is also necessary to have a random scan order so that the utilization of Internet channels and batch loading of the providers' equipment on the “last mile” is not significant.

    Difficulties and "thin" moments

    In practice, when conducting mass network scans, one may encounter a number of difficulties of both technological and organizational nature.

    Performance and Availability Issues

    As mentioned earlier, a massive network scan creates a stream of traffic with a high PPS value, causing a heavy load on both the local infrastructure and the network of the provider to which the scan nodes are connected. Before carrying out any scans for research purposes, it is imperative to coordinate all actions with local administrators and representatives of the provider. On the other hand, performance problems may occur on the end scanning nodes when renting virtual servers on Internet hosting.

    Not all VPS are equally useful.
    На VPS, арендованных не у мейнстримовых хостеров, можно столкнуться с ситуацией, когда на производительных VM (по нескольку vCPU и парой гигабайт памяти) пакетные рейты Zmap и Masscan не поднимаются выше пары десятков Kpps. Чаще всего негативную роль играет сочетание устаревшего железа и неоптимальной комбинации ПО виртуальной среды. Так или иначе, по опыту автора, более или менее существенные гарантии производительности можно получить только у компаний – лидеров рынка.

    Also, when identifying vulnerable nodes, it should be borne in mind that certain NSE checks can lead to a denial of service on scanned nodes. Such a development of the situation is not just not ethical, but is quite definitely outside the legal field and entails the onset of consequences in the form of responsibility, including criminal liability.

    What are we looking for and how?

    In order to find something, you need to know how to look for it. Threats are constantly evolving, and new analytical work is constantly needed to identify them. Sometimes analytics published in research documents become obsolete, prompting them to conduct their own research.

    One famous example
    В качестве примера можно взять исследование [5] 2013 года авторов Zmap, посвященное выявлению узлов TOR-сети. На момент публикации документа путем анализа цепочки сертификатов и выявления особым образом сгенерированных subject name в самоподписанных сертификатах удалось найти ~67 тысяч TOR-узлов, работающих на порту tcp/443, и ~2,9 тысяч узлов, работающих на tcp/9001. В настоящее время перечень выявляемых данным способом TOR-узлов оказывается гораздо меньше (применяется большее разнообразие портов, используются obfs-транспорты, происходит миграция на Let's Encrypt-сертификаты), что вынуждает использовать иные аналитические методы для решения подобной задачи.

    Abuse complaints

    When scanning the Internet, there is almost a 100% risk of receiving a stream of abuse complaints. Especially annoying automatically generated abuse on a couple of dozen SYN-packages. And if scans are repeated on a regular basis, a stream of manually created complaints may also go.

    Who complains most often
    Основные источники abuse (в основном автоматических) – образовательные учреждения в доменной зоне *.edu. По опыту особенно следует избегать сканирования адресов из ASN, принадлежащих Индианскому Университету (США). Также в презентациях авторов Zmap [5] и Masscan [7] можно найти несколько занимательных примеров неадекватной реакции на сетевое сканирование.

    You should not be frivolous about this abuse, because:

    • In case of complaints, the hoster most often blocks traffic to or from rented virtual machines or even suspends their operation.
    • The ISP may disable uplink in order to eliminate the risks of autonomous system blocking.

    Best practice for minimizing complaints is to do the following:

    1. Familiarize yourself with the terms of service of your internet / hosting provider.
    2. Creating a scan of the information page on the source addresses, which tells about the purpose of scanning; Making the appropriate explanations in the DNS TXT-records.
    3. Clarification of scans when receiving abuse complaints.
    4. Maintain a list of exclusions from scanning, add requests to the subnets on request, provide clarifications on the procedure for entering exceptions on the information page.
    5. Scanning is no longer and no more frequently than is required to solve the problem.
    6. Distribution of scanning traffic by source, destination-addresses, and time when it is possible.

    Network Scan Outsourcing

    Conducting independent mass scans requires a weighted approach (the risks are described above), and you need to be aware that if the analytical processing of the scan results is not deep enough, then the final result may be unsatisfactory. At a minimum, the technical part can be facilitated by using existing commercial tools, such as:

    More services
    Кроме упомянутых сервисов, существует достаточно большое число проектов, так или иначе связанных с глобальным сетевым сканированием. Например:

    • (бывший поисковик по вебу PunkSPIDER)
    • Project Sonar (датасет с результатами сетевых сканов от Rapid7)
    • Thingful (еще один поисковик по IoT)


    The first and most well-known search engine for published services on the Internet, which was originally created as a search engine for IoT and was launched in 2009 by John Materley [11] . At the moment, the search engine offers several levels of access to information within itself. Without registration, the possibilities are very basic. When registering and purchasing a membership, the following functionality becomes available:

    • Search by basic filters.
    • Issue of detailed information about the nodes ("raw data").
    • "Basic" API with the ability to integrate with common utilities, such as Maltego, Metasploit, Nmap and others.
    • The ability to export search results (for a fee, for "Credits" - the internal currency of the service).
    • Ability to scan on request individual addresses and ranges (for a fee, for "Credits").

    The mentioned possibilities are already enough for the basic OSINT [12] , however, the search engine fully discloses its capabilities with an Enterprise subscription, namely:

    • On-demand scanning of the entire public IPv4 range of the Internet, including checking the specified range of ports and specifying protocol-specific banner collectors.
    • Real-time notifications on the results of triggered scans.
    • The ability to upload raw scan results for further analysis (no limitations, the entire database is available).
    • The possibility of developing custom collectors for specific tasks (non-standard protocols, etc.).
    • Access to historical scan results (until October 2015).

    As part of the Shodan project, a number of related projects were created, such as a malware-hunter, honeypot scan, exploits, which enrich the scan results.


    Search engine for IoT and more, created by Zmap author Zakir Durumerik and publicly presented in 2015. The search engine uses Zmap technology and a number of related projects (ZGrab and ZTag). Without registration, unlike Shodan, the search engine is limited to 5 requests per day from one IP. After registration, the use possibilities are expanded: the API appears, search results are increased by queries (up to 1000 results), but the most complete access, including uploading historical data, becomes available only on the Enterprise plan.

    The advantages of the search engine over Shodan include the following features:

    • There are no restrictions on search filters on basic tariff plans.
    • The high speed of updating the results of automatic scanning (daily, according to the authors; the speed of Shodan is significantly lower).

    The disadvantages include:

    • Fewer ports scanned compared to Shodan.
    • The inability to scan "on demand" (both by IP-ranges and ports).
    • The smaller depth of enrichment of the results out of the box (which is obvious, given the greater number of Shodan tools implemented through related projects).

    If there are enough results collected on the Censys-supported port range to achieve this goal, the lack of on-demand scanning should not be a serious obstacle due to the high speed of updating scan results.


    IoT search engine created by the Chinese IB-company Knowsec Inc in 2013. To date, the search engine works using its own development - Xmap (host scanner) and Wmap (web scanner). The search engine collects information on a fairly wide range of ports, while searching it is possible to categorize according to a large number of criteria (ports, services, operating systems, applications) with details on each host (the contents of the application banners). The search results display a list of possible vulnerabilities for the identified application software from the related project SeeBug (without testing for applicability). For registered accounts, API and web search are also available with a full set of filters, but with a limit on the number of displayed results. To remove the restrictions proposed acquisition of Enterprise-plan. Of the advantages of the search engine can be noted:

    • A large number of scanned ports (roughly similar to Shodan).
    • Extended web support.
    • The identifier of the application software and OS.

    The disadvantages include:

    • Slow update interval for scan results.
    • Lack of scanning capability on request.
    • Enrichment of scan results is limited to the identification of services, application software, operating systems and potential vulnerabilities.

    Comparative table of services

    The summary table below provides a comparison of the functionality of the services considered.

    Values ​​in table rows
    "++" – очень хорошо реализовано
    "+" – реализовано
    "±" – реализовано с ограничениями
    "-" – не реализовано

    We scan ourselves

    Own infrastructure can also be a useful source of TI so that you can understand which new hosts and services are on the network, are they vulnerable or even malicious? If it is possible to use external services for perimeter scanning (for example, this usage scenario is officially supported by Shodan), then all actions inside the perimeter can only be performed independently. The range of tools for network analysis in this case can be quite extensive: these are both passive analyzers, such as Bro [13] , Argus [14] , Nfdump [15] , p0f [16] , and active scanners — Nmap, Zmap, Masscan and their commercial competitors. And IVRE framework can help in interpreting the collected results.[17] , allowing you to get your own Shodan / Censys-like tool.

    The framework was developed by a group of IB researchers, one of the authors is the active developer of the scapy utility [18] Pierre Lalette [19] . The framework features include:

    • The use of visual analytics to identify patterns and abnormal deviations.
    • Advanced search engine with detailed parsing of scan results.
    • Integration with third-party utilities through an API with Python support.

    IVRE is also well suited for analyzing large scans of the Internet.


    Scanning and active network intelligence are excellent tools that have long been used by various researchers in the field of information security. However, for the "classic" bezopasnik such methods of work is still new. The use of OSINT and offensive-methods in combination with classical defensive-means will significantly enhance protection and ensure its pro-activity.

    Mikhail Larin, expert at the Jet CSIRT Information Security and Response Center, Jet Infosystems.

    Also popular now: