Stroking a lizard, or network load testing with Cisco TRex



    It is customary to somehow sidestep the topic of load testing network equipment; it usually comes up only in passing, in the context of terribly expensive specialized hardware. I found no information in Russian about this open-source product, so allow me to do a little popularizing. In this article I give a small HOWTO to introduce people to software traffic generators.

    Cisco TRex is a high-performance traffic generator. Under the hood it uses DPDK. Requirements: a 64-bit architecture and a compatible network card; the supported distributions are Fedora 18-20 with a 64-bit kernel (not 32-bit) and Ubuntu 14.04.1 LTS with a 64-bit kernel (not 32-bit). You can run it on another Linux by building the necessary drivers and compiling your own version from the sources in the repository on GitHub; everything there is standard.
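
    A quick way to check the platform before you start (these are generic Linux commands, nothing TRex-specific):
     $>uname -m                   # should print x86_64
     $>lspci | grep -i ethernet   # list the NICs to compare against the DPDK-compatible models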

    DPDK


    The Data Plane Development Kit (DPDK) was originally developed by Intel and later handed over to the open-source community.
    DPDK is a framework that provides a set of libraries and drivers to speed up packet processing in applications running on Intel architecture. DPDK is supported on any Intel processor, from Atom to Xeon, of any performance class, with no limit on the number of cores or processors. It has since been ported to architectures other than x86 as well: IBM Power 8, ARM, and others.
    Without going into technical detail, DPDK makes it possible to exclude the Linux network stack from packet processing entirely: an application running in user space talks to the hardware directly.
    In addition to physical NICs, it can work with paravirtualized VMware adapters (VMXNET/VMXNET3, connected through VMware vSwitch) and E1000 (VMware / KVM / VirtualBox).

    Deploy


    Download, unpack, and build TRex.
    WEB_URL=http://trex-tgn.cisco.com/trex # or csi-wiki-01:8181/trex (Cisco internal)
    mkdir trex
    cd trex
    wget --no-cache $WEB_URL/release/v2.05.tar.gz
    tar -xzvf v2.05.tar.gz
    cd v2.05
    cd ko/src  
    make  
    make install  
    cd -  
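
    If you want to make sure the kernel module step actually produced something, a generic find will do (igb_uio is the DPDK UIO driver that shows up later in the port listing; this check is just a hint, not an official step):
     $>find . /lib/modules/$(uname -r) -name 'igb_uio.ko' 2>/dev/null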
    


    The interfaces that will be used for testing must be taken away from Linux and handed over to DPDK. To do this, first run a command that shows the PCI IDs of all interfaces:
     $>sudo ./dpdk_setup_ports.py --s
     Network devices using DPDK-compatible driver
     ============================================
     Network devices using kernel driver
     ===================================
     0000:02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth2 drv=e1000 unused=igb_uio *Active*
     0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #1
     0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #2
     0000:13:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #3
     0000:13:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #4
     Other network devices
     =====================
    

    Then add them to the configuration file. It is recommended to copy the file as shown below, because then TRex will pick it up automatically and you will not have to specify the path by hand on every start.
    cp  cfg/simple_cfg.yaml /etc/trex_cfg.yaml
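
    If you skip the copy, the path to the config can be passed explicitly on each run instead; to my knowledge t-rex-64 accepts a --cfg option for this, but verify the exact spelling against the help output of your version:
    sudo ./t-rex-64 --cfg cfg/simple_cfg.yaml [the rest of the usual options]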
    

    As you can see, the configuration is stored in the now popular YAML format; looking ahead, the test definitions are stored in YAML as well. It is also recommended to set the MAC addresses.
    Just in case, an example of how the file should look:
            - port_limit      : 2
              version         : 2
              interfaces      : ["03:00.0","03:00.1"]
              port_info       :  # set eth mac addr
                      - dest_mac        :   [0x1,0x0,0x0,0x1,0x0,0x00]  # port 0
                        src_mac         :   [0x2,0x0,0x0,0x2,0x0,0x00]
                      - dest_mac        :   [0x2,0x0,0x0,0x2,0x0,0x00]  # port 1
                        src_mac         :   [0x1,0x0,0x0,0x1,0x0,0x00]
    

    port 0 - src
    port 1 - dst

    Let's load something already


    The physical interfaces must be connected to something in an input-output scheme: one interface sends packets to the other (in fact, the generator can emulate full, honest TCP sessions, but more on that later).

    The following command runs a test that loads the device under test with DNS requests in one direction for 100 seconds. By the way, if you want the template to run on all interfaces (so that this traffic goes in both directions), add the -p switch.
    sudo ./t-rex-64 -f cap2/dns.yaml -c 4 -m 1 -d 100 -l 1000
    

    -c - the number of CPU cores to use.
    -m - the cps multiplier applied to each packet template.
    -d - the test duration in seconds.
    -l - the rate (in Hz) of latency-measurement packets; many of the reported statistics are calculated without taking them into account.

    In this case the output will contain something like this (slightly trimmed, keeping the most interesting parts):
     -Global stats enabled
     Cpu Utilization : 0.0  %  29.7 Gb/core 
     Platform_factor : 1.0
     Total-Tx        :     867.89 Kbps                                             
     Total-Rx        :     867.86 Kbps                                             
     Total-PPS       :       1.64 Kpps
     Total-CPS       :       0.50  cps
     Expected-PPS    :       2.00  pps
     Expected-CPS    :       1.00  cps
     Expected-BPS    :       1.36 Kbps
     Active-flows    :        0  Clients :      510   Socket-util  : 0.0000 %
     Open-flows      :        1  Servers :      254   Socket   :        1  Socket/Clients :  0.0
     drop-rate       :       0.00  bps   
     current time    : 5.3 sec
     test duration   : 94.7 sec
    

    Cpu Utilization - the average CPU load of the transmitting threads. For best performance it is recommended to keep it below 80%.
    Total-Tx - the total rate on the transmitting interface (in this case port 0).
    Total-Rx - the total rate on the receiving interface (in this case port 1).
    Total-PPS - packets per second on the interfaces.
    Total-CPS - connections per second; in effect, the number of templates from the configuration file started per second.

    Expected-PPS - the expected number of packets per second; in theory it tends toward cps * the number of packets in the template.
    Expected-CPS - the cps specified in the YAML test file.
    Expected-BPS - the expected total traffic, template size * cps (a worked example follows below).
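
    As a cross-check against the dns.yaml run above: the template is defined with cps : 1.0, so Expected-CPS = 1.00; Expected-PPS = 2.00 pps suggests the dns.pcap template consists of two packets (query and response), i.e. 1.0 * 2; and Expected-BPS = 1.36 Kbps, which divided by 8 gives roughly 170 bytes on the wire per connection per second. (The packet count and byte size here are inferred from the counters, not read out of the pcap itself.)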

    Active-flows - the number of internal TRex flows. In essence, this is the number of sessions TRex is currently tracking; for example, if your pcap template describes a session lasting 30 seconds, this counter should tend toward 30 * Expected-CPS.

    If you want to really "load" the network, you can increase the template multiplier and add -p.
    sudo ./t-rex-64 -f cap2/dns.yaml -c 4 -m 9000 -d 100 -l 1000 -p
    

    This increases the number of flows with the same IP addresses; if the diversity of the traffic (src addresses) matters to you, increase the cps in the test configuration file instead (there is a sketch of this in the next section); and speaking of that file...

    Test configurations


    Consider cap2/dns.yaml:
    - duration : 10.0
      generator :  
              distribution : "seq"
              clients_start : "16.0.0.1"
              clients_end   : "16.0.1.255"
              servers_start : "48.0.0.1"
              servers_end   : "48.0.0.255"
              clients_per_gb : 201
              min_clients    : 101
              dual_port_mask : "1.0.0.0" 
              tcp_aging      : 1
              udp_aging      : 1
      mac        : [0x00,0x00,0x00,0x01,0x00,0x00]
      #vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 }
      #mac_override_by_ip : true
      cap_info : 
         - name: cap2/dns.pcap
           cps : 1.0
           ipg : 10000
           rtt : 10000
           w   : 1
    

    clients_start - clients_end - the range of src (client) addresses.
    servers_start - servers_end - the range of dst (server) addresses.

    - name: cap2/dns.pcap - the pcap file to be used as the template.
    cps - the number of connections per second; essentially the number of concurrently launched flows from your template. I.e. if the test increments addresses and cps : 10, then 10 flows with different addresses will be started simultaneously (see the sketch below).
    ipg - the inter-packet gap; should be the same as rtt.
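
    Coming back to the earlier point about address diversity, here is a minimal sketch of the same cap_info block with the cps raised (the value 100.0 is arbitrary, purely for illustration):
      cap_info : 
         - name: cap2/dns.pcap
           cps : 100.0   # 100 flows started per second -> more distinct addresses in flight at any moment
           ipg : 10000
           rtt : 10000
           w   : 1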

    In general, the TRex logic looks like this: it walks through the entire range of IP addresses, changing the dst and src address on each iteration; when the ranges are exhausted, the cycle repeats with the port incremented, and so on roughly 64k times. With the ranges above, for instance, the first flow goes 16.0.0.1 -> 48.0.0.1, the next 16.0.0.2 -> 48.0.0.2, and so on until the ranges wrap around.

    Testing NAT


    The serious folks at Cisco implemented a very important feature: they let the generator create honest TCP sessions and track them. For example, if there is NAT between our interfaces, we can tell TRex "we have NAT", and the traffic will be accounted for with the address translations detected.
    There are three modes in total:
    mode 1 - works only with TCP. We look at the ACK that arrives for the first SYN and learn the translation from it. A good, honest mode.
    mode 2 - works using an IP option.
    mode 3 - works like mode 1, but does not teach the server-side sequence number to the client side. May give higher cps than the first mode.

    sudo ./t-rex-64 -f cap2/http_simple.yaml -c 4  -l 1000 -d 100000 -m 30  --learn-mode 1
    -Global stats enabled 
     Cpu Utilization : 0.1  %  13.4 Gb/core 
     Platform_factor : 1.0  
     Total-Tx        :      24.12 Mbps   Nat_time_out    :        0 
     Total-Rx        :      24.09 Mbps   Nat_no_fid      :        0 
     Total-PPS       :       5.08 Kpps   Total_nat_active:        1 
     Total-CPS       :      83.31  cps   Total_nat_open  :     1508 
     Expected-PPS    :       3.08 Kpps  
     Expected-CPS    :      83.28  cps  
     Expected-BPS    :      22.94 Mbps  
     Active-flows    :       11  Clients :      252   Socket-util : 0.0001 %    
     Open-flows      :     1508  Servers :    65532   Socket :       11 Socket/Clients :  0.0 
     drop-rate       :       0.00  bps   
     current time    : 18.7 sec  
     test duration   : 99981.3 sec  
    

    Nat_time_out - should be zero; the number of flows that TRex could not keep track of for some reason, which usually happens when packets are being dropped somewhere.
    Nat_no_fid - should be zero; usually becomes non-zero when the timeouts inside the device under test are too long.
    Total_nat_active - the number of active flows; should be low when the rtt is low.
    Total_nat_open - the total number of flows; may differ for uni-directional templates.

    In fact there is one more important parameter we did not specify: --l-pkt-mode. It selects the type of packets used to measure latency and works together with the -l switch; by the way, these packets are not counted anywhere except in the latency output, i.e. parameters like Open-flows should not be affected (see the example after the list).
    0 (default) - SCTP packets;
    1 - ICMP on both sides;
    2 - stateful: ICMP is sent from one side and matched on the other. This option makes sense if your equipment drops packets coming from the outside;
    3 - ICMP packets are always sent with sequence number 0.
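
    For example, to run the same NAT test but measure latency with ICMP on both sides, one would add the flag to the command above (double-check the exact spelling against ./t-rex-64 --help for your version):
    sudo ./t-rex-64 -f cap2/http_simple.yaml -c 4 -l 1000 -d 100000 -m 30 --learn-mode 1 --l-pkt-mode 1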

    The end.


    If there is interest, next time I will talk about the changes in version 2.06.
    I strongly recommend considering this generator for testing your projects: it is undemanding, accessible and, most importantly, open source.

    Sources


    trex-tgn.cisco.com/trex/doc
    sdnblog.ru/what-is-intel-dpdk
    github.com/cisco-system-traffic-generator/trex-core
