Quickly discovering the MIB modules an SNMP device supports

    When implementing systems for monitoring and managing IT infrastructure, you often have to deal with “non-standard” devices. Often, all that is known about such a device is that it supports SNMP. Connecting it to the project starts with answering the question of what information it provides about itself. Usually this is done with a complete poll of the device, and the collected data is then analyzed to pick out what is useful. But here, as they say, there are nuances. In this note I’ll talk about one of them: the algorithm we developed to quickly determine which MIB modules a device “supports”.

    Why is this necessary?


    A device can expose quite a lot of data. For example, the Cisco 2600 router I experimented with while writing this article returns more than 12 thousand values.
    [Figure: Cisco size]
    And, I must say, that is far from the limit.

    This raises a couple of problems: how do you find the useful or necessary data in this heap of riches, and how do you do it in a reasonable amount of time?

    The solution to the first problem lies on the surface: following the ancient strategy of “divide and conquer”, split all the collected values into a relatively small number of categories and evaluate each category for usefulness. The answer to the question of how, and into which categories, to split the data is also fairly obvious here: all (well, almost all) of the data has already been sorted for us into MIB modules, each of which usually holds the descriptions of logically related data elements (variables).

    [Figure: SNMP Assets]
    The standard way to split SNMP data in AggreGate Network Manager is by MIB modules.

    Thus, the subtask of extracting the necessary information reduces to mapping the received data to MIB modules (the link is made through the variable identifier, the OID).
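
    The mapping described above can be sketched in Python roughly as follows. This is only an illustration, not the AggreGate implementation: it assumes OIDs are held as tuples of integers, that a hypothetical `object_owner` table links each object OID from the MIB library to a single module name, and that a collected value belongs to the module of the longest library OID that is a prefix of its instance OID.

```python
from collections import defaultdict

def group_by_module(values, object_owner):
    """Group collected SNMP values by the MIB module that describes them.

    values:       {instance OID tuple: value} as collected from the device
    object_owner: {object OID tuple: module name}, a hypothetical lookup table
    """
    groups = defaultdict(dict)
    for oid, value in values.items():
        module = "unknown"
        # Try the longest prefix first: an instance OID extends its object OID.
        for n in range(len(oid), 0, -1):
            if oid[:n] in object_owner:
                module = object_owner[oid[:n]]
                break
        groups[module][oid] = value
    return dict(groups)

# Illustrative data only: sysDescr and ifNumber objects, plus one unmapped OID.
object_owner = {
    (1, 3, 6, 1, 2, 1, 1, 1): "SNMPv2-MIB",
    (1, 3, 6, 1, 2, 1, 2, 1): "IF-MIB",
}
values = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "Example router",
    (1, 3, 6, 1, 2, 1, 2, 1, 0): 4,
    (1, 3, 6, 1, 4, 1, 99999, 1, 0): "vendor-specific",
}
groups = group_by_module(values, object_owner)
```

    Values whose OID matches nothing in the library end up in an "unknown" group, which is one simple way to keep them visible for later inspection.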

    There are many tools that do something similar, and all of them (at least all the ones we know of) perform a complete poll of the device.

    For example, the MIB Walk utility from SolarWinds Engineer's Toolset polls this same Cisco router from my computer in 3.5 to 4 minutes. That does not seem like much. But bear in mind that this is not the “biggest” device, and that I reach it over a lightly loaded local network. In a real production project, with serious traffic and the device on a different network, the time for a complete poll can grow by orders of magnitude. And while such a poll is running, the specialist whose job is to connect the device will inevitably get distracted and, as they say, “lose context”, so the time needed to “get back into the task” has to be added as well (and it often turns out to be a significant extra). Also consider that a serious project often has many such devices: in some cases we had to study 20 to 30 of them. In the end, it adds up to a significant amount of time.

    One way or another, both our own specialists who deploy monitoring systems and users who configure the system themselves began, at some point, to regularly name waiting for a complete poll of SNMP devices as one of the factors that significantly slowed their work. So we had to find a way to cut down this useless waiting. As a result, we devised the following algorithm and implemented it in our system.

    Algorithm for quick detection of MIB modules available on an SNMP device


    A good statement of the problem is half the solution. We can state the problem as follows.
    Given: a list of MIB modules and an SNMP device.
    Required: determine, for each of these MIB modules, whether this device “supports” it.
    This wording immediately raises the question: what does “a MIB module supported by the device” mean?
    A MIB module is a description of a set of SNMP variables. In light of this, the following answer sounds logical: we will consider a MIB module supported if at least one of the variables described in it is present on the device.
    Note: there is one small complication. The same variable can be described in different MIBs. We will take this into account below.
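
    To make the definition concrete, here is a brute-force reference check in Python. It is a sketch under stated assumptions: OIDs are tuples of integers, the module names and OIDs are made up for illustration, and a variable counts as "described" by a module when one of the module's object OIDs is a prefix of the instance OID seen on the device. It needs the full list of device OIDs, so it shows what "supported" means rather than how to find it quickly.

```python
def supported_modules(library, device_oids):
    """Return modules with at least one described variable present on the device.

    library:     {module name: set of object OID tuples described by the module}
    device_oids: instance OID tuples seen on the device (e.g. from a full poll)
    """
    def describes(obj, inst):
        # An object OID describes every instance OID that it prefixes.
        return inst[:len(obj)] == obj

    return {
        module
        for module, objects in library.items()
        if any(describes(obj, inst) for obj in objects for inst in device_oids)
    }

# Hypothetical modules: OID (1, 3, 2) is described by both MIB-B and MIB-C.
library = {
    "MIB-A": {(1, 3, 1)},
    "MIB-B": {(1, 3, 2)},
    "MIB-C": {(1, 3, 2), (1, 3, 3)},
}
device_oids = [(1, 3, 2, 0)]
result = supported_modules(library, device_oids)
```

    With this data a single variable on the device marks both MIB-B and MIB-C as supported, which is exactly the overlap mentioned in the note.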

    The optimization idea follows directly from this definition: once we find one variable from a given MIB module on the device, the remaining variables of that module can be excluded from the poll. Since the variables of a MIB module usually come in fairly large blocks, we can reduce the amount of data we need to fetch from the device not just noticeably but, as practice shows, radically. The polling time decreases accordingly.

    We get the following algorithm:
    • First, build the list of OIDs described in each MIB of our library. For each OID, remember which MIBs it belongs to (there may be several, remember?), then merge these lists into a single set, sorted in lexicographical OID order.
    • Next, having received a variable from the device with GET_NEXT and determined the MIB modules it belongs to, we not only add these modules to the list of supported ones but can also, in good conscience, remove from the poll list every OID that belongs only to these MIB modules.
    • The next GET_NEXT is issued for the first variable remaining in the poll list.


    Thus, we do not “walk” (WALK) the device; we literally bound across it in big leaps.
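
    The steps above can be sketched in Python as follows. This is a simplified illustration, not the AggreGate driver code. The assumptions: OIDs are tuples of integers (so plain tuple comparison gives lexicographical OID order), the device is simulated by a hypothetical in-memory `make_agent` helper instead of real SNMP requests, and a returned variable belongs to the modules of its longest prefix known to the library. One step is added beyond the description: every poll-list OID that sorts before the returned variable is dropped, because GET_NEXT has already shown that nothing new lies in that range.

```python
from bisect import bisect_right

def make_agent(device_oids):
    """Simulated agent: get_next(oid) returns the first device OID after oid."""
    ordered = sorted(device_oids)
    def get_next(oid):
        i = bisect_right(ordered, oid)
        return ordered[i] if i < len(ordered) else None  # None = end of data
    return get_next

def owners_of(oid, owners):
    """Modules of the longest library OID that is a prefix of `oid`."""
    for n in range(len(oid), 0, -1):
        if oid[:n] in owners:
            return owners[oid[:n]]
    return set()

def detect_modules(library, get_next):
    """Return (supported modules, number of GET_NEXT requests issued)."""
    owners = {}  # OID -> set of modules that describe it
    for module, oids in library.items():
        for oid in oids:
            owners.setdefault(oid, set()).add(module)
    poll = sorted(owners)
    supported, requests = set(), 0
    while poll:
        found = get_next(poll[0])
        requests += 1
        if found is None:
            break
        supported |= owners_of(found, owners)
        # Keep only OIDs not yet passed and not fully covered by known modules.
        poll = [o for o in poll if o >= found and not owners[o] <= supported]
    return supported, requests

# Illustrative library: a few objects from three standard modules.
library = {
    "SNMPv2-MIB": [(1, 3, 6, 1, 2, 1, 1, 1), (1, 3, 6, 1, 2, 1, 1, 3),
                   (1, 3, 6, 1, 2, 1, 1, 5)],
    "IF-MIB": [(1, 3, 6, 1, 2, 1, 2, 1), (1, 3, 6, 1, 2, 1, 2, 2, 1, 2)],
    "HOST-RESOURCES-MIB": [(1, 3, 6, 1, 2, 1, 25, 1, 1)],
}
# Simulated device: six variables, none from HOST-RESOURCES-MIB.
device = [
    (1, 3, 6, 1, 2, 1, 1, 1, 0), (1, 3, 6, 1, 2, 1, 1, 3, 0),
    (1, 3, 6, 1, 2, 1, 1, 5, 0), (1, 3, 6, 1, 2, 1, 2, 1, 0),
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2, 1), (1, 3, 6, 1, 2, 1, 2, 2, 1, 2, 2),
]
supported, requests = detect_modules(library, make_agent(device))
```

    On this toy data the sketch reports SNMPv2-MIB and IF-MIB after three GET_NEXT requests, while a full walk of the same simulated device would fetch all six variables.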

    Given how tightly OIDs are grouped within a MIB module, the algorithm can be improved slightly by “thinning out” the initial OID list in advance: if a run of consecutive OIDs belongs to one MIB module (or, more generally, to the same set of MIB modules), there is no point in checking all of them. A GET_NEXT request for the first of them will in any case either return a variable from this group or show that this block of data is absent from the device.
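
    A possible way to do this thinning in Python, again only as a sketch: OIDs are assumed to be tuples of integers, the module names are hypothetical, and from each run of consecutive OIDs with an identical set of owning modules only the first OID is kept.

```python
from itertools import groupby

def thin_out(library):
    """Return the sorted poll list with one OID per run of identical owners.

    library: {module name: iterable of object OID tuples}
    """
    owners = {}  # OID -> set of modules that describe it
    for module, oids in library.items():
        for oid in oids:
            owners.setdefault(oid, set()).add(module)
    ordered = sorted(owners)  # tuple comparison = lexicographical OID order
    # groupby splits the sorted list into runs with the same owner set;
    # the first OID of each run is enough to probe the whole run.
    return [next(run)
            for _, run in groupby(ordered, key=lambda oid: frozenset(owners[oid]))]

# Hypothetical library: OID (1, 3) is described by both modules.
library = {
    "MIB-A": [(1, 1), (1, 2), (1, 3)],
    "MIB-B": [(1, 3), (1, 4), (1, 5)],
}
thinned = thin_out(library)
```

    Here five OIDs collapse to three probes: one for the MIB-A run, one for the shared OID, and one for the MIB-B run.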

    Results


    The figures show the result of detecting MIBs on the Cisco router mentioned above.
    The beginning of the list of detected modules:
    [Figure: Cisco MIBs 1]

    And this is its last page:
    [Figure: Cisco MIBs 2]

    As you can see, 64 MIB modules were found. By the way, the algorithm ran for 1-2 seconds.

    The following screenshot shows the result of detection on a “non-standard” Hirschmann Railswitch RSB20 device.
    [Figure: Hirschmann MIBs]
    The last two entries represent the “custom” MIB modules that ship with this device.

    The “live” process of detecting MIB modules on the Hirschmann device can be seen in our video about connecting non-standard devices (connoisseurs may be interested in the English version). Admittedly, all the magic with MIBs stays behind the scenes and fits into a two- to three-second interval, but our approach to working with SNMP devices will become clear.

    Conclusion


    The fast detection algorithm for the MIB modules supported by a device is implemented in the AggreGate SNMP driver. By now it has been debugged and has been running stably for several years in IT infrastructure monitoring projects at various levels. Over the past year no errors have been found in it, which suggests, at the very least, that the idea is sound. Before that, inaccuracies did come up from time to time, but 99% of them were caused by agent implementations on devices that deviated from the SNMP specifications in one way or another. But the customer is always right, so we had to amend the driver to account for such “features”, and this affected the implementation of this algorithm too.
