animage September 29, 2011 at 19:21

DIY Loopdetect

The essence of the problem

One of the worst scourges of an Ethernet network is the so-called loop. Loops arise when, mainly through human error, a ring forms in the network topology. For example, two switch ports get connected with a patch cord (this often happens when two switches are replaced by one and every cable is plugged back in without checking), or a node is brought up on a new line and the old one is never disconnected (the consequences can be dire and hard to diagnose). In such a loop packets begin to multiply, switching tables become corrupted, and traffic grows like an avalanche. Under these conditions network equipment freezes and the network can fail completely.
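
The avalanche can be illustrated with a toy model. This is only a sketch under simplifying assumptions that are not from the article: every switch floods a broadcast out of all ports except the one it arrived on, there is no STP or loop protection, and the topology is given at the level of "which switch is wired to which". The function name `flood` and the sample topologies are made up for the illustration; the code is Python 3 and uses no third-party libraries.

```python
def flood(links, start, max_steps=20):
    """Simulate naive broadcast flooding (no STP, no loop protection).

    links: dict mapping a switch name to the list of its neighbours.
    Returns the number of frame copies in flight after each step.
    """
    # Each in-flight copy is (switch holding it, switch it arrived from).
    in_flight = [(start, None)]
    history = []
    for _ in range(max_steps):
        next_flight = []
        for node, came_from in in_flight:
            # Flood out of every port except the ingress one.
            for neighbour in links[node]:
                if neighbour != came_from:
                    next_flight.append((neighbour, node))
        in_flight = next_flight
        history.append(len(in_flight))
        if not in_flight:
            break  # the broadcast has died out
    return history


chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}            # no loop
ring = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}   # one loop
mesh = {"A": ["B", "C", "D"], "B": ["A", "C", "D"],
        "C": ["A", "B", "D"], "D": ["A", "B", "C"]}          # several loops

print(flood(chain, "A"))               # [1, 1, 0]
print(flood(ring, "A", max_steps=5))   # [2, 2, 2, 2, 2]
print(flood(mesh, "A", max_steps=4))   # [3, 6, 12, 24]
```

In the loop-free chain the broadcast dies out once it reaches the far end. In a single ring it circulates forever, and with more than one loop the number of copies doubles at every step in this model, which is the avalanche-like growth described above.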

Besides these loops, it is not uncommon for a burnt-out port (on a switch or a network card) to start returning received packets back into the network; most often the connection is negotiated at 10M, and the link comes up even when the cable is disconnected. When there is only one such port in a segment, the consequences may be less severe, but they are still very noticeable (users of Windows Vista and 7 suffer especially). In any case, such things must be fought mercilessly, bearing in mind that a loop created intentionally or accidentally, even for a short time, can take down an entire network segment.

Background

Fortunately, most modern managed switches have loop detection functions in one form or another (loopdetect, STP); moreover, the STP protocol family lets you build a ring topology deliberately (to increase fault tolerance and reliability). But there is a flip side: it often happens that a single burnt port leaves an entire area without connectivity. And with STP the topology is not rebuilt instantly, so connectivity suffers while it reconverges. In addition, some manufacturers are careless in implementing loop detection: the D-Link DES-3016, for example, cannot detect a loop at all if you simply connect two of its ports together.

Identification Principles

The principle of loop detection (loopdetect) is quite simple. A special packet with a broadcast destination address (intended for everyone) is sent into the network, and if it comes back, the network behind that interface is considered looped. What happens next depends on the type of equipment and its settings. Most often the port is blocked completely or partially (in a particular VLAN), the event is logged, and SNMP traps are sent. This is where system administrators and the emergency service come in.
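
The principle can be sketched independently of any capture library. In the Python 3 sketch below, `send` and `receive` are hypothetical callables standing in for the real network I/O, and `probe_port` is a name invented for this illustration; none of them come from a switch vendor or from the script later in this post.

```python
import os


def probe_port(send, receive, payload_size=42):
    """Send one uniquely tagged probe and count how many copies come back.

    send:    callable taking the probe bytes (hypothetical network output).
    receive: callable returning an iterable of frames seen after sending
             (hypothetical network input).
    """
    # A random body ensures that stale probes or foreign traffic never match.
    probe = os.urandom(payload_size)
    send(probe)
    return sum(1 for frame in receive() if frame == probe)


# Simulated wire: whatever is sent is stored, and a "looped" network
# hands the last probe back twice among unrelated traffic.
wire = []
healthy = probe_port(wire.append, lambda: [b"unrelated"])
looped = probe_port(wire.append, lambda: [wire[-1], b"unrelated", wire[-1]])
print(healthy, looped)
```

What to do with a non-zero count (block the port, log, send a trap) is a policy decision and is deliberately left out of the sketch.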

If the entire network is managed, identifying and eliminating the loop is not difficult. But there are plenty of networks where a chain of 5-6 unmanaged switches hangs off a single port. Removing such a loop can take a lot of time and effort. The search boils down to disabling and re-enabling ports one by one. To tell whether the loop is present, either the upstream managed switch or a sniffer (wireshark, tcpdump) is used. The first method is very dangerous because of the delay between the block being applied and lifted: at best users will simply see lags, and at worst a loopdetect further upstream will trigger and a much larger segment will drop off. The second method poses no danger to users, but it is much harder to tell whether a loop is present (especially in a small segment with little broadcast traffic).
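
The port-by-port search can be written down as a short Python 3 sketch. The `disable`, `enable` and `loop_present` callables are hypothetical stand-ins for whatever actually toggles a port (pulling a patch cord, a switch CLI) and checks for the loop (a sniffer or a probe sender); the function name is also invented for the illustration.

```python
def find_looped_ports(ports, disable, enable, loop_present):
    """Disable ports one at a time and note which ones make the loop vanish."""
    culprits = []
    for port in ports:
        disable(port)
        try:
            if not loop_present():
                culprits.append(port)
        finally:
            # Restore the port so that the next check starts from the same state.
            enable(port)
    return culprits


# Simulated 8-port unmanaged switch with ports 3 and 7 patched together:
# the loop exists only while both ports are enabled.
enabled = set(range(1, 9))
result = find_looped_ports(range(1, 9), enabled.discard, enabled.add,
                           lambda: {3, 7} <= enabled)
print(result)   # [3, 7]
```

Note that the sketch re-enables every port, including the culprits, so on a live network the loop comes back after each check; in practice one would stop at the first hit and leave that port disconnected.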

Do it yourself

As mentioned above, there are more than enough hardware implementations of loop detection. So without further ado I start wireshark, set up a filter and watch what the switch does and how. It is all quite simple: the switch sends an Ethernet packet out of the port with destination address cf:00:00:00:00:00, type 0x9000 (CTP) and an unknown function number 256 (the documentation I found describes only two). The destination is a group (multicast) address, which switches normally flood like a broadcast, so if there is a loop in the network, several copies of this packet should come back.
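
For reference, a frame with this layout can be assembled with nothing but the Python 3 standard library. This is a sketch, not the switch's exact packet: the function name `build_probe` is made up, the zero padding to 60 bytes (the Ethernet minimum without the FCS) is my assumption, and the example payload simply reuses the `00000001` prefix that the script below puts in front of its random body.

```python
import struct

CTP_DST = bytes.fromhex("cf0000000000")   # destination seen in the capture
CTP_ETHERTYPE = 0x9000                    # EtherType seen in the capture (CTP)


def build_probe(src_mac, payload):
    """Return a raw Ethernet frame: dst + src + EtherType + payload."""
    src = bytes.fromhex(src_mac.replace(":", ""))
    if len(src) != 6:
        raise ValueError("source MAC must be 6 bytes")
    # "!H" packs the EtherType as a 16-bit big-endian (network order) value.
    header = CTP_DST + src + struct.pack("!H", CTP_ETHERTYPE)
    # Assumption: pad short frames with zeros up to 60 bytes.
    return (header + payload).ljust(60, b"\x00")


frame = build_probe("00:19:5b:01:02:03",
                    bytes.fromhex("00000001") + b"\xaa" * 42)
print(len(frame), frame[:14].hex())
```

With a 4-byte prefix and a 42-byte body the frame comes out at exactly 60 bytes, so no padding is added in this example.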

First I decided on the libraries:

pcapy, to capture and send raw packets;
dpkt, to build the packets;
pyaudio and wave, to play the sound;
and a few standard libraries.

From here everything is straightforward. I call pcapy.open_live for the selected interface and set a filter on the resulting capture object. The outer loop periodically sends a packet, and an inner loop captures and processes the packets that come back. Each captured packet identical to the one sent adds 1 to a counter. If more than one copy has been received when the timeout expires, a sound is played and a loop message is printed to the console.

The resulting script is below.

# Written for Python 2 (print statements, str.decode('hex'))
import pcapy, dpkt, sys
import time, random, socket
import pyaudio, wave
def packetBody(length):
    # Return `length` random bytes as a list of two-digit hex strings
    return [random.choice('0123456789abcdef') + random.choice('0123456789abcdef')
            for x in range(length)]
class loopDetector:
    packetCount = 0
    loopCount = 0
    timeout = 1
    def __init__(self,iface):
        self.iface = iface
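        # snaplen 100 bytes, promiscuous mode on, 500 ms read timeout
        self.pcaper = pcapy.open_live(iface,100,1,500)
        # Random source MAC so that each run matches only its own probes
        self.Mac = '00:19:5b:'+':'.join(packetBody(3))
        self.pcaper.setfilter('ether dst cf:00:00:00:00:00 and ether src %s' % self.Mac)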
        wf = wave.open('alarm.wav', 'rb')
        self.pyA = pyaudio.PyAudio()
        self.stream = self.pyA.open(format =
                self.pyA.get_format_from_width(wf.getsampwidth()),
                channels = wf.getnchannels(),
                rate = wf.getframerate(),
                output = True)
        self.wfData = wf.readframes(100000)
        wf.close()
    def __del__(self):
        self.stream.stop_stream()
        self.stream.close()
        self.pyA.terminate()
    def PlayAlarm(self):
        self.stream.write(self.wfData)
    def Capture(self,hdr,data):
        # Count only frames that are byte-for-byte copies of the last probe sent
        if data == str(self.sPkt):
            self.packetReceived += 1
    def Process(self):
        while 1:
            try:
                pktData = '00000001' + ''.join(packetBody(42))
                self.sPkt = dpkt.ethernet.Ethernet(dst="cf0000000000".decode('hex'),
                                              src=''.join(self.Mac.split(':')).decode('hex'),
                                              type=36864,data=pktData.decode('hex'))
                endTime = time.time() + self.timeout
                print "Send packet to %s" % self.iface
                self.packetCount += 1
                self.pcaper.sendpacket(str(self.sPkt))
                self.packetReceived = 0
                while time.time() < endTime:
                    try:
                        self.pcaper.dispatch(-1,self.Capture)
                    except socket.timeout:
                        pass
                if self.packetReceived > 1:
                    self.loopCount += 1
                    print "Loop Detected. Duplication found %s" % self.packetReceived
                    self.PlayAlarm()
            except KeyboardInterrupt:
                break
        print "Packets sent: ", self.packetCount , "Loops discovered : " , self.loopCount
def main():
    # Enumerate capture-capable interfaces; the first one is the default
    dev_list = pcapy.findalldevs()
    if not dev_list:
        print "No device found"
        sys.exit(1)
    iface = dev_list[0]
    if len(sys.argv) == 2:
        if sys.argv[1] in ['list', 'ls', 'all']:
            for n, name in enumerate(dev_list):
                print 'Index:', n, 'Device name:', name
            return 0
        try:
            iface = dev_list[int(sys.argv[1])]
        except (ValueError, IndexError):
            # Non-numeric or out-of-range index: fall back to the first device
            print "Invalid device id, trying to use the first one"
            iface = dev_list[0]
    ld = loopDetector(iface)
    ld.Process()
if __name__ == "__main__":
    main()

Link to the original and source
