Sigma rules: a craft or a new standard for SOC?

    I am Sergey Rublev, head of SOC (Security Operations Center) at Infosecurity.
    In this article, I will examine in detail the ambitious Sigma Rules project, whose motto is: "Sigma for logs is like Snort for traffic and YARA for files."



    I will cover three aspects:

    • Applicability of the Sigma rule syntax for maintaining a knowledge base of threat detection scenarios
    • Capabilities of the tools that generate rules for off-the-shelf SIEM systems
    • Value for a SOC of the current content of the public Sigma rule repositories

    Once upon a time, in a galaxy far, far away


    It all started a few years ago, when the trees were large and our monitoring team was still small. We faced a lot of questions; almost any team that outgrows a three-person line goes through this.



    These questions arise for different reasons:

    • Team growth
    • Staff turnover
    • A large number of heterogeneous systems for monitoring

    If you have to take over support of a SIEM that someone else configured, the number of questions grows like an avalanche.

    Use Case Library


    International experience in building monitoring centers has already produced a way to bring order to this chaos: the use case library. The purpose of each case is to comprehensively describe the solution to one problem within information security monitoring.

    The knowledge recorded in each case may vary; we work with the following set:

    • Objective - the task solved by the case
    • Threat - the threat that the detection rule seeks to detect
    • Stakeholders - the parties interested in this rule: IS / IT / Business
    • Data Requirements - the data set required to identify the threat
    • Logic - the logic of the threat detection rule
    • Testing - an algorithm for testing the correctness of the detection rule
    • Priority - the priority of handling events for this case (usually calculated from the potential damage of a successfully realized threat)
    • Output - a list of actions for triaging the alert, a description of the valid outcomes of the triage procedure, and the data to be recorded in the triage results

    Use case example for the task of detecting communication with a botnet command-and-control server (C&C, or simply C2):



    The example is significantly simplified; in reality, a properly described case grows into a multi-page document.
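    The use case structure listed above can also be kept in a machine-readable form. The following is only a minimal sketch: the UseCase class and every field value in the C2 example are hypothetical and made up for illustration; they are not taken from our real library.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class UseCase:
    """One entry of a use case library; the fields mirror the list above."""
    objective: str
    threat: str
    stakeholders: list
    data_requirements: list
    logic: str
    testing: str
    priority: str
    output: str


# Hypothetical, heavily simplified C2 case; every value is illustrative.
c2_case = UseCase(
    objective="Detect hosts communicating with a botnet C2 server",
    threat="An infected host receives commands from a C2 server",
    stakeholders=["IS", "IT"],
    data_requirements=["Proxy logs", "List of known C2 addresses"],
    logic="Destination of an outbound connection is in the C2 list",
    testing="Request a test address from the C2 list on a lab host",
    priority="high",
    output="Confirm the connection, isolate the host, record the verdict",
)

# The same record serialized for tools (the machine-friendly side).
print(json.dumps(asdict(c2_case), indent=2))
```

    A flat record like this is easy to validate and to export, but free-text fields such as Logic and Output still need a human reader.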

    When the number of cases exceeded several dozen, we started looking for ready-made tools for maintaining such a knowledge base, preferably ones that offer a machine-friendly interface in addition to a human-friendly one.

    Sigma Project


    The Sigma project certainly deserves consideration as a basis for a knowledge base of incident detection rules. It started in 2016, and I have been following it almost from the very beginning.

    The project essentially consists of:

    • Sigma rules themselves
    • Utilities for converting rules into queries for various SIEM systems

    The list of supported SIEMs is impressive: almost all popular event analysis solutions are present. Let's go through everything in detail and in order.

    Rule syntax


    Sigma rules are YAML documents describing a scenario for detecting a specific attack. Syntactically, the rules consist of the following blocks:

    Meta information


    A descriptive part that structures the rules and makes the right ones easier to find.

    title: Access to ADMIN$ Share
    description: Detects access to $ADMIN share
    author: Florian Roth
    falsepositives: 
        - Legitimate administrative activity
    level: low
    tags:
        - attack.lateral_movement
        - attack.t1077
    status: experimental

    I would also like to note that many rules already carry references to attack techniques from the MITRE ATT&CK framework.
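    Because the references live in the tags list, they are easy to pull out programmatically. This is a small sketch that works on an already parsed tags list; the function name split_attack_tags is hypothetical, and treating every other attack.* tag as a tactic is a simplifying assumption.

```python
import re

# Technique tags look like "attack.t1077"; sub-techniques like "attack.t1021.002".
TECHNIQUE_RE = re.compile(r"^attack\.(t\d{4}(?:\.\d{3})?)$", re.IGNORECASE)


def split_attack_tags(tags):
    """Split Sigma tags into ATT&CK technique IDs and tactic names."""
    techniques, tactics = [], []
    for tag in tags:
        match = TECHNIQUE_RE.match(tag)
        if match:
            techniques.append(match.group(1).upper())
        elif tag.lower().startswith("attack."):
            # Simplification: any other attack.* tag is treated as a tactic.
            tactics.append(tag.split(".", 1)[1])
    return techniques, tactics


# Tags from the meta information example above.
rule_tags = ["attack.lateral_movement", "attack.t1077"]
techniques, tactics = split_attack_tags(rule_tags)
print(techniques, tactics)
```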

    Data Source Declaration


    A description of the source whose events the detection logic is built on.

    logsource:
        product: windows
        service: security
    

    Syntactically, you can describe both a specific service of a particular product and a whole category of systems.
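    To illustrate the difference between a narrow and a broad declaration, here is a sketch of how a logsource block could be matched against the sources connected to a SIEM. The function logsource_matches and the source descriptions are hypothetical; this is not how the Sigma tools themselves resolve sources.

```python
def logsource_matches(declared, source):
    """True if every attribute the rule declares equals the source's attribute."""
    return all(source.get(key) == value for key, value in declared.items())


# Declarations as they would appear in a rule's logsource block.
windows_security = {"product": "windows", "service": "security"}
any_proxy = {"category": "proxy"}

# Hypothetical descriptions of sources connected to a SIEM.
sources = [
    {"name": "dc01", "product": "windows", "service": "security"},
    {"name": "dc01-sysmon", "product": "windows", "service": "sysmon"},
    {"name": "squid01", "category": "proxy", "product": "squid"},
]

for declared in (windows_security, any_proxy):
    matched = [s["name"] for s in sources if logsource_matches(declared, s)]
    print(declared, "->", matched)
```

    The first declaration selects one service of one product, while the second selects every source of the proxy category regardless of vendor.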

    Processing Logic Declaration


    At the detection logic level, the following are described:

    • Search patterns
    • Values of certain fields in the log
    • Time frame
    • Aggregate Functions

    Logic can be trivial, for example, conditions imposed on a set of fields:

    detection:
        selection:
            EventID: 5140
            ShareName: Admin$
        filter:
            SubjectUserName: '*$'
        condition: selection and not filter
    

    and quite complicated:

    detection:
        selection1:
            EventID:
                - 529
                - 4625
            UserName: '*'
            WorkstationName: '*'
        selection2:
            EventID: 4776
            UserName: '*'
            Workstation: '*'
        timeframe: 24h 
        condition:
            - selection1 | count(UserName) by WorkstationName > 3
            - selection2 | count(UserName) by Workstation > 3
    

    The expressive power of the language, although not universal, is still broad enough to describe a large number of attack detection cases.
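    To make the semantics of the two detection blocks above more tangible, here is a toy evaluator over events represented as dictionaries. It is a sketch under several assumptions: wildcards are matched case-insensitively with fnmatch (which also treats "?" and "[" specially), count(UserName) is read as the number of distinct user names, and the events are assumed to be already limited to the 24h timeframe. All function names and events are made up.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def field_matches(pattern, value):
    """Case-insensitive wildcard match; a list of patterns means 'any of'."""
    if isinstance(pattern, list):
        return any(field_matches(p, value) for p in pattern)
    if value is None:
        return False
    return fnmatchcase(str(value).lower(), str(pattern).lower())


def block_matches(block, event):
    """All fields of a selection/filter block must match (logical AND)."""
    return all(field_matches(p, event.get(f)) for f, p in block.items())


# First rule: condition "selection and not filter".
selection = {"EventID": 5140, "ShareName": "Admin$"}
filter_ = {"SubjectUserName": "*$"}


def admin_share_alert(event):
    return block_matches(selection, event) and not block_matches(filter_, event)


# Second rule, first condition: count(UserName) by WorkstationName > 3.
selection1 = {"EventID": [529, 4625], "UserName": "*", "WorkstationName": "*"}


def noisy_workstations(events, threshold=3):
    """Workstations with more than `threshold` distinct failing user names."""
    users = defaultdict(set)
    for event in events:
        if block_matches(selection1, event):
            users[event["WorkstationName"]].add(event["UserName"])
    return sorted(ws for ws, names in users.items() if len(names) > threshold)


# Made-up events for illustration only.
by_user = {"EventID": 5140, "ShareName": "ADMIN$", "SubjectUserName": "alice"}
by_machine = {"EventID": 5140, "ShareName": "ADMIN$", "SubjectUserName": "PC01$"}
failed = [{"EventID": 4625, "UserName": f"user{i}", "WorkstationName": "WS7"}
          for i in range(4)]

print(admin_share_alert(by_user), admin_share_alert(by_machine))
print(noisy_workstations(failed))
```

    The filter block excludes machine accounts (names ending in "$"), which is why only the first event raises an alert.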

    Rule Development Tools


    In addition to your favorite text editor for YAML, there is also a web UI from SOC Prime that lets you both validate the syntax of an already written rule and build rules by hand from graphical blocks.



    Sigma as a Knowledge Base Tool


    A brief summary.

    Currently, the rule syntax concentrates mainly on describing threat detection logic and is not intended for a comprehensive description of a use case; accordingly, a full-fledged library cannot be maintained with Sigma rules alone.

    For the use case structure we have chosen, Sigma covers only half of the fields (Objective, Data Requirements, Logic and Priority).



    Conversion to various SIEMs


    Since we are a SOC service provider, the idea of keeping all our correlation rule content in a universal format and converting it to the required SIEM format at the implementation stage seemed very attractive to us.

    The project includes console utilities for generating queries in the formats of various SIEMs. Let's look at what conversion is and what is under its hood.



    The conversion takes place in three stages:

    1. Parsing the rule. This one is straightforward: the YAML document is split into its component blocks.
    2. Reduction to the SIEM taxonomy. This stage is needed because normalization is implemented slightly differently in each SIEM, so declarations from Sigma rules have to be mapped to the event taxonomy of the selected SIEM.
    3. Generating the query for the SIEM. This stage requires one more component: a backend for the SIEM in question. The backend is essentially a plug-in for the conversion utility that contains the logic for producing the final SIEM query. The detection and logsource blocks are converted using the field mapping applied earlier, and additional SIEM-specific information is added.
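    The three stages can be sketched in a few lines. This is a toy model, not the code of the real conversion utility: the rule is assumed to be already parsed from YAML into a dictionary, the field names in the mapping belong to an imaginary SIEM, and the output query syntax is invented for illustration.

```python
def map_fields(detection, field_mapping):
    """Stage 2: rename Sigma fields to the target SIEM's taxonomy."""
    return {
        name: {field_mapping.get(f, f): v for f, v in block.items()}
        for name, block in detection.items()
        if isinstance(block, dict)
    }


def render_block(block):
    """Render one selection/filter block as an AND of field="value" terms."""
    return " AND ".join(f'{f}="{v}"' for f, v in block.items())


def generate_query(rule, field_mapping):
    """Stage 3: a toy backend that only understands 'selection and not filter'."""
    detection = rule["detection"]
    if detection.get("condition") != "selection and not filter":
        raise ValueError("toy backend supports a single condition form")
    blocks = map_fields(detection, field_mapping)
    positive = render_block(blocks["selection"])
    negative = render_block(blocks["filter"])
    return f"({positive}) AND NOT ({negative})"


# Stage 1 (YAML parsing) is assumed to have produced this dict already.
rule = {
    "logsource": {"product": "windows", "service": "security"},
    "detection": {
        "selection": {"EventID": 5140, "ShareName": "Admin$"},
        "filter": {"SubjectUserName": "*$"},
        "condition": "selection and not filter",
    },
}

# Hypothetical mapping to an imaginary SIEM's field names.
mapping = {"EventID": "event_id", "ShareName": "share_name",
           "SubjectUserName": "user_name"}

print(generate_query(rule, mapping))
```

    Even in this toy form it is visible that the result is a search query only; nothing in it describes what the SIEM should do once the query matches.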

    As a result, starting the conversion utility looks as follows:



    The following are passed as parameters:

    • Target SIEM
    • The rule
    • Mapping file for this SIEM
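    For scripted use, the same three parameters can be assembled from Python. This sketch assumes the sigmac utility is on the PATH and that -t selects the target and -c the mapping configuration; the flags should be checked against sigmac --help for the installed version, and the target, configuration name and rule path below are example values only.

```python
import shlex


def build_sigmac_command(target, rule_path, config):
    """Assemble a sigmac command line: target SIEM, mapping config, rule file."""
    return ["sigmac", "-t", target, "-c", config, rule_path]


# Example values only; substitute your own target, config and rule.
cmd = build_sigmac_command("splunk", "rules/windows/example_rule.yml", "splunk-windows")
print(shlex.join(cmd))
# To actually run it:
# subprocess.run(cmd, check=True, capture_output=True, text=True)
```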


    SOC Prime also has a ready-made UI for the conversion function (uncoder.io).



    Pitfalls of conversion


    After studying the mechanics of conversion, we ran into significant limitations that kept us from converting all of our content to the Sigma format:

    • The converter deals only with the query. A correlation rule in a SIEM covers more aspects: time window, aggregation, actions triggered by the resulting alerts.
    • Key features of individual SIEMs, such as ActiveLists, are not taken into account.
    • Field mapping is not detailed enough: the mapping configuration describes the fields of only a few sources, so if your rule base covers several dozen different types of event sources, you will have to invest heavily in writing the mapping.
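    The size of the mapping gap can be estimated before committing to a conversion. The sketch below lists, per rule, the detection fields that a given mapping does not cover. It assumes rules are already parsed into dictionaries with flat selection blocks; the function names, the proxy rule and the mapping are hypothetical.

```python
def detection_fields(rule):
    """Collect every field name used in the rule's selection/filter blocks."""
    fields = set()
    for block in rule["detection"].values():
        if isinstance(block, dict):
            # Drop value modifiers such as "|contains" from the field name.
            fields.update(name.split("|")[0] for name in block)
    return fields


def unmapped_fields(rules, field_mapping):
    """Map rule title -> sorted list of fields the mapping does not cover."""
    report = {}
    for rule in rules:
        missing = sorted(detection_fields(rule) - set(field_mapping))
        if missing:
            report[rule["title"]] = missing
    return report


# Hypothetical rule set and mapping, for illustration only.
rules = [
    {"title": "Access to ADMIN$ Share",
     "detection": {"selection": {"EventID": 5140, "ShareName": "Admin$"},
                   "filter": {"SubjectUserName": "*$"},
                   "condition": "selection and not filter"}},
    {"title": "Hypothetical proxy rule",
     "detection": {"selection": {"c-uri|contains": "/gate.php",
                                 "cs-host": "*.example"},
                   "condition": "selection"}},
]
mapping = {"EventID": "event_id", "ShareName": "share_name",
           "SubjectUserName": "user_name"}

print(unmapped_fields(rules, mapping))
```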

    Rule Base


    Let's see what the publicly available Sigma rule base contains. Currently, content is actively being added to two repositories:

    • The main project repository
    • SOC Prime Threat Detection Marketplace

    The rules in the two repositories partly overlap.
    SOC Prime has a number of rules that apply to paid subscriptions; I do not consider their content in this article.

    For the analysis, we need the sigmatools library for Python and some programming skills.

    To parse and load rules from a directory into a dictionary, you can use the following code:

    import itertools
    import pathlib

    from sigma.parser.collection import SigmaCollectionParser


    def alliter(path):
        """Recursively yield all files under path, skipping hidden entries."""
        for sub in path.iterdir():
            if sub.name.startswith("."):
                continue
            if sub.is_dir():
                yield from alliter(sub)
            else:
                yield sub


    def get_inputs(paths, recursive):
        """Return the files found under the given paths."""
        if recursive:
            return list(itertools.chain.from_iterable(alliter(pathlib.Path(p)) for p in paths))
        return [pathlib.Path(p) for p in paths]


    # Local checkout of the rules; forward slashes also work on Windows.
    BASE_PATH = ['sigma/rules']

    rules_map = {}
    for sigmafile in get_inputs(BASE_PATH, True):
        # Open each file in a context manager so the handle is always closed.
        with sigmafile.open(encoding='utf-8') as f:
            parser = SigmaCollectionParser(f)
            # Take the first rule of the file; rules with the same title
            # overwrite each other, which deduplicates them by title.
            rule = next(iter(parser))
        rules_map[rule['title']] = rule
    

    After deduplicating identical rules, the following picture emerges:



    Within the deduplicated rule set, we obtain the following distributions:

    By type of event source:


    Slightly coarser statistics:

    • Windows ~ 80%
    • Sysmon ~ 53%
    • Proxy ~ 8%
    • Linux ~ 4%

    The current content focuses mainly on Windows, and on Sysmon in particular; rules for other systems are few.

    By content readiness (status):


    It turns out that the authors of Sigma rules have marked fewer than 20% of all existing rules as stable.
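    Distributions of this kind can be computed from the parsed rules with a few lines of code. The sketch below uses a small made-up list in place of rules_map.values(), so its numbers say nothing about the real repositories; the helper share_by is hypothetical.

```python
from collections import Counter


def share_by(rules, key_func):
    """Percentage of rules per key, rounded to whole percent."""
    counts = Counter(key_func(rule) for rule in rules)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: round(100 * n / total) for key, n in counts.most_common()}


def source_of(rule):
    """Use the product if declared, otherwise fall back to the category."""
    logsource = rule.get("logsource", {})
    return logsource.get("product") or logsource.get("category") or "unknown"


# Toy stand-in for rules_map.values(); not real repository data.
rules = [
    {"title": "A", "status": "experimental",
     "logsource": {"product": "windows", "service": "sysmon"}},
    {"title": "B", "status": "stable",
     "logsource": {"product": "windows", "service": "security"}},
    {"title": "C", "status": "experimental",
     "logsource": {"product": "windows", "service": "sysmon"}},
    {"title": "D", "logsource": {"category": "proxy"}},
]

by_source = share_by(rules, source_of)
by_status = share_by(rules, lambda r: r.get("status", "unset"))
print(by_source)
print(by_status)
```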

    To summarize


    Publicly available sources contain a large number of rules. They are updated regularly, and rules for detecting the indicators, and sometimes even the techniques, of the most high-profile APT campaigns appear quickly.

    There are many limitations on applying these rules in real life:

    • There are a lot of rules for Microsoft Sysmon, which is rarely used in enterprise environments.
    • Many rules actually perform IoC checks (hashes, IP addresses, URLs, User Agents). Such rules quickly become obsolete, and there are more effective mechanisms for finding IoCs than rules.
    • Much of the content is experimental, so it requires thorough testing before being put into production.

    At Infosecurity, we use the content of Sigma rules as an additional source of knowledge for more effective incident detection. If we find something interesting, we implement it within our own correlation rules, which take into account both the engine the rules run on (Apache Spark) and the specifics of the infrastructures and security tools we work with.
