Common Event Format Inside



    One after another, vendors of software and hardware solutions announce that they have been certified as supporting the HP ArcSight Common Event Format (CEF): Stonesoft, Tripwire, Citrix, Imperva, NetScout and several dozen other vendors...



    About SIEM


    According to Gartner's annual reports, HP ArcSight (until September 2010, simply ArcSight) is consistently among the leaders in SIEM-class solutions. In short, such solutions collect in one place the hundreds of thousands or even millions of events generated by various information security systems and perform correlation analysis on them. The correlation produces security incidents, and those are what a person, an administrator or an operator, then works with.

    In their marketing brochures, SIEM vendors usually state the number of supported event sources. For example, ArcSight ESM is said to support 300+ devices and applications, while QRadar SIEM has “only” 200+. Who counts the sources and how is beside the point here, because this article is about the format of information security events (logs).

    About logs


    There are no common requirements for the structure of information security logs and events yet. So each developer formats logs in whatever way is most convenient for them.

    Some simply write them to a text file; some offer the option of sending data to a Syslog server. One insists on SNMP, another wants to put all the logs in a relational database. There is Microsoft with its .evt format, and there is CheckPoint with its OPSEC. And do not forget about SDEE.

    About the problem


    Both the structure of the logs and the protocols used to transmit them differ from product to product. From the point of view of integration with SIEM solutions (and not only with them), this is bad, because for unified processing the logs must be normalized, i.e. brought to a single format. Events in a single format are easier to store, easier to search and easier to report on.

    Normalization involves parsing and mapping fields. So each connector between an information security system and the SIEM solution is an application that parses and maps according to its configuration.
    If the SIEM vendor does not support the product you need, you will have to write that configuration yourself: design and test regular expressions, study the database structure, master SQL*Plus, write mapping rules. Sad, all in all...
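
The parsing-and-mapping work a connector does can be sketched roughly as follows. This is only an illustration: the raw log line, the regular expression and the normalized field names are all invented here and do not come from any real product or SIEM.

```python
import re

# Hypothetical raw log line from an imaginary firewall (illustration only).
RAW_LINE = "2012-01-18 11:07:53 DENY tcp 10.0.0.1:1232 -> 2.1.2.2:80"

# Parsing: a regular expression pulls the individual fields out of the text.
PATTERN = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<action>\w+) (?P<proto>\w+) "
    r"(?P<src_ip>[\d.]+):(?P<src_port>\d+) -> "
    r"(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)"
)

# Mapping: vendor-specific names are translated to one common schema.
# The names on the right are assumptions made up for this sketch.
FIELD_MAP = {
    "action": "device_action",
    "proto": "transport_protocol",
    "src_ip": "source_address",
    "src_port": "source_port",
    "dst_ip": "destination_address",
    "dst_port": "destination_port",
}


def normalize(line):
    """Return the normalized event as a dict, or None if the line does not parse."""
    match = PATTERN.match(line)
    if match is None:
        return None
    return {
        FIELD_MAP[name]: value
        for name, value in match.groupdict().items()
        if name in FIELD_MAP
    }


print(normalize(RAW_LINE))
```

Every unsupported source needs its own pattern and its own mapping table, which is exactly the manual work described above.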

    Proposed solution


    “It would be nice if all vendors of information security systems wrote their logs in a way that everyone understands,” thought ArcSight, and in 2006 it published the CEF format it had developed for general review. In my opinion, there is nothing complicated about this format. Only a few requirements have to be met:

    Requirement No. 1 - We use Syslog as the transport.

    This one seems clear enough.
    If I want, I use UDP; if I want, TCP.
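
As a minimal sketch of the transport choice, the function below sends one already formatted message over UDP or TCP using Python's standard socket module. The function name, the default port 514 and the newline used to terminate a message on TCP are assumptions for illustration; a real collector may expect a different port or framing.

```python
import socket


def send_syslog(message, host, port=514, use_tcp=False):
    """Send one syslog line to host:port over UDP (default) or TCP."""
    data = message.encode("utf-8")
    if use_tcp:
        # TCP is a byte stream, so messages need a delimiter.
        # A trailing newline is assumed here; check what your receiver expects.
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(data + b"\n")
    else:
        # UDP: one datagram per message, no delivery guarantee.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(data, (host, port))
```

For example, `send_syslog(line, "127.0.0.1")` would send `line` as a single UDP datagram to a collector listening locally on port 514.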

    Requirement No. 2 - We fill in eight required fields.

    Of course, the traditional syslog header should be present at the beginning of the message.
    Jan 18 11:07:53 host

    Only after it come the prefix CEF: and the set of required fields, separated by the "|" character:
    CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension

    Here:
    • Version - CEF format version
    • Device Vendor, Device Product and Device Version - these strings uniquely identify the event source. No two products may share the same combination of these three values
    • Signature ID - a unique identifier for the type of event
    • Name - human-readable description of the event
    • Severity - event severity (0 to 10)
    • Extension - see below
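
The field layout above can be read back with a few lines of code. The sketch below is a hedged illustration, not a reference parser: the function and field names are my own, severity is assumed to be an integer as described above, and a pipe or backslash inside a header field is assumed to be escaped with a backslash.

```python
HEADER_FIELDS = (
    "version", "device_vendor", "device_product", "device_version",
    "signature_id", "name", "severity",
)


def split_cef_header(body, count=len(HEADER_FIELDS)):
    """Split off `count` pipe-delimited fields; return (fields, remainder)."""
    fields, current, i = [], [], 0
    while i < len(body) and len(fields) < count:
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in "|\\":
            # Escaped pipe or backslash: keep the character, drop the escape.
            current.append(body[i + 1])
            i += 2
        elif ch == "|":
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if len(fields) < count:
        raise ValueError("expected %d header fields followed by '|'" % count)
    return fields, body[i:]


def parse_cef(message):
    """Parse a syslog line carrying a CEF message into a dict."""
    start = message.find("CEF:")
    if start == -1:
        raise ValueError("no CEF: prefix found")
    fields, extension = split_cef_header(message[start + len("CEF:"):])
    event = dict(zip(HEADER_FIELDS, fields))
    event["severity"] = int(event["severity"])   # 0 to 10
    event["syslog_header"] = message[:start].strip()
    event["extension"] = extension               # left unparsed in this sketch
    return event


event = parse_cef(
    "May 29 15:26:33 host CEF:0|McAfee|Antivirus|5.2|100|"
    "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232"
)
print(event["device_vendor"], event["name"], event["severity"])
```

Only the first seven pipes are treated as separators, so whatever follows them is handed over untouched as the Extension.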


    Requirement No. 3 - The Extension field is filled in according to the CEF dictionary.

    The Extension field is a set of key-value pairs. Here are a few keys from the dictionary:

    dmac is the Destination MAC Address (for example, 00:0D:60:AF:1B:65)
    spt is the Source Port (a port number from 0 to 65535)
    request is the Request URL (for an HTTP request, the URL that was requested)

    The CEF specification gives the dictionary in full, with the data type and maximum length of each key.
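
A rough sketch of reading the Extension field is given below. It assumes that pairs are separated by spaces, that a value runs until the next `key=` token, and that "=" and "\" inside a value are escaped with a backslash. The type table holds only the three keys quoted above; the function names are my own.

```python
import re

# Only the three keys quoted in the article; the real dictionary is far longer.
KEY_TYPES = {"dmac": str, "spt": int, "request": str}

# A key is a word at the start of the string or after whitespace, followed by "=".
KEY_RE = re.compile(r"(?:^|\s)(\w+)=")

ESCAPES = {"n": "\n", "r": "\r"}


def unescape(value):
    """Turn \\=, \\\\, \\n and \\r back into the characters they stand for."""
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), value)


def parse_extension(extension):
    """Parse 'key=value key=value ...' into a dict; values may contain spaces."""
    matches = list(KEY_RE.finditer(extension))
    result = {}
    for index, match in enumerate(matches):
        # The value runs up to the start of the next key, or to the end.
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(extension)
        key = match.group(1)
        value = unescape(extension[match.end():end])
        result[key] = KEY_TYPES.get(key, str)(value)
    return result


print(parse_extension("src=10.0.0.1 dst=2.1.2.2 spt=1232"))
```

Keys missing from the small type table are kept as plain strings, so `src` and `dst` come back as text while `spt` becomes an integer.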

    The end of the document lists some more general requirements. In particular, the entire message must be UTF-8 encoded. The rules for escaping certain special characters and for multi-line values are given there as well.

    As a result, a message formatted in exact accordance with CEF should look something like this:
    May 29 15:26:33 host CEF:0|McAfee|Antivirus|5.2|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232
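
For the producing side, here is a minimal sketch of how an application might assemble such a line. The function names and parameters are hypothetical. The escaping shown (backslash before "|" and "\" in header fields, before "=" and "\" in extension values, and line breaks written as \n or \r) is my reading of the escaping rules mentioned above, so verify it against the specification before relying on it.

```python
from datetime import datetime


def escape_header(value):
    """Escape backslash and pipe inside a header field."""
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def escape_extension(value):
    """Escape backslash, equals sign and line breaks inside an extension value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def format_cef(vendor, product, version, signature_id, name, severity,
               extension, host="host", when=None):
    """Build one syslog line carrying a CEF version 0 message."""
    when = when or datetime.now()
    # Traditional syslog timestamp: "Mmm dd hh:mm:ss", day padded with a space.
    syslog_header = f"{when.strftime('%b')} {when.day:2d} {when:%H:%M:%S} {host}"
    header = "|".join(
        escape_header(field)
        for field in (0, vendor, product, version, signature_id, name, severity)
    )
    pairs = " ".join(
        f"{key}={escape_extension(value)}" for key, value in extension.items()
    )
    return f"{syslog_header} CEF:{header}|{pairs}"


line = format_cef(
    "McAfee", "Antivirus", "5.2", 100, "worm successfully stopped", 10,
    {"src": "10.0.0.1", "dst": "2.1.2.2", "spt": 1232},
    when=datetime(2012, 5, 29, 15, 26, 33),
)
print(line)
```

With these inputs the function reproduces the sample message shown above; the year in the timestamp is arbitrary, since the traditional syslog header does not carry one.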

    What is the benefit?


    1. Connecting a new event source becomes “plug and play”
    2. The load on the SIEM system itself is reduced
    3. SIEM vendors stop competing on the number of supported sources and pay more attention to other aspects (product usability, fault tolerance, etc.)
    4. And in general, this format was devised not only for SIEM. It is already used in other log-processing solutions.
