Protecting Your Router: QoS

    QoS is a big topic. Before talking about the subtleties of the settings and the various approaches to applying traffic processing rules, it makes sense to recall what QoS actually is.

    Quality of Service (QoS) is a technology for providing different classes of traffic with different service priorities.

    Firstly, it is easy to see that any prioritization makes sense only where a queue for service arises. It is there, in the queue, that you can "slip" to the front, exercising your right.
    A queue forms wherever things get narrow (such places are usually called bottlenecks). A typical bottleneck is an office's Internet access: computers connected to the LAN at 100 Mbit/s or more all share a channel to the provider that rarely exceeds 100 Mbit/s and often amounts to a mere 1-2-10 Mbit/s. For everyone.

    Secondly, QoS is not a panacea: if the bottleneck is too narrow, the physical buffer of the interface, where all packets due to leave through that interface are placed, often fills up. And then newly arrived packets will be destroyed, even if they are vitally important. Therefore, if the queue on an interface exceeds, on average, 20% of its maximum size (on cisco routers the maximum queue size is usually 128-256 packets), there is reason to think hard about your network design: lay additional routes or widen the channel to the provider.

    Let's deal with the constituent elements of the technology.


    Marking. In the header fields of various network protocols (Ethernet, IP, ATM, MPLS, etc.) there are special fields dedicated to marking traffic. Traffic needs to be marked so that subsequent processing in queues is simpler.

    Ethernet. The Class of Service (CoS) field is 3 bits. It allows splitting traffic into 8 streams with different markings.

    IP. There are 2 standards: an old one and a new one. In the old one there was a ToS field (8 bits), 3 bits of which were in turn called IP Precedence. This field was copied into the CoS field of the Ethernet header.
    A new standard was defined later. The ToS field was renamed DiffServ, and 6 bits were allocated for the Differentiated Services Code Point (DSCP) field, in which the parameters required for a given type of traffic can be conveyed.

    It is best to mark data as close as possible to its source. For this reason, most IP phones themselves add the DSCP = EF or CS5 value to the IP header of voice packets. Many applications likewise mark traffic on their own in the hope that their packets will be processed first; peer-to-peer networks, for example, are notorious for this.

    Queues

    Even if we do not use any prioritization technologies, that does not mean no queues arise. At a bottleneck a queue will form in any case, served by the standard FIFO (First In First Out) mechanism. Such a queue obviously lets packets avoid immediate destruction, holding them in the buffer before sending, but it provides no preference to, say, voice traffic.

    If you want to give some chosen class absolute priority (i.e., packets of this class are always processed first), this technology is called Priority Queuing. All packets sitting in the physical outgoing buffer of the interface are divided into 2 logical queues, and packets from the privileged queue are sent until it is empty. Only then do packets from the second queue begin to be transmitted. This technology is simple and rather crude; it can be considered obsolete, because the processing of non-priority traffic will constantly stall. On cisco routers you can create
    4 queues with different priorities. They observe a strict hierarchy: packets from less privileged queues are not served until all queues with a higher priority are empty.
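
    A minimal sketch of such a legacy Priority Queuing configuration (the list number, ACL 100, and the interface are illustrative):

    ! traffic matching ACL 100 goes to the high queue, everything else to low
    priority-list 1 protocol ip high list 100
    priority-list 1 default low
    int s0/0
    priority-group 1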

    Fair Queuing. A technology that gives every class of traffic equal rights. It is usually not used, as it yields little in terms of improving the quality of service.

    Weighted Fair Queuing (WFQ). A technology that gives different classes of traffic different rights (one can say the "weight" of the queues differs), yet serves all the queues. Roughly speaking, it works like this: all packets are divided into logical queues, using the IP Precedence field as the criterion. The same field sets the priority (the higher, the better). Then the router computes which packet from which queue would be "quickest" to transmit, and transmits it.


    It computes this according to the formula:

    dT = (t(i) - t(0)) / (1 + IPP)

    IPP - the value of the IP Precedence field
    t(i) - the time required for the actual transmission of the packet by the interface. It can be computed as L / Speed, where L is the packet length and Speed is the interface transmission rate.
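
    A quick worked example (taking t(i) - t(0) as the packet's transmission time): on a 128 kbit/s interface a 1500-byte packet takes t = 1500 * 8 / 128000 ≈ 94 ms to transmit. With IPP = 0 its dT ≈ 94 ms, while the same packet marked IPP = 5 gets dT ≈ 94 / 6 ≈ 16 ms and is therefore scheduled first.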

    This queue is enabled by default on all interfaces of cisco routers, except for point-to-point interfaces (HDLC or PPP encapsulation).

    WFQ has a number of disadvantages: such queuing relies on markings already applied to the packets, and it does not let you define traffic classes or allocate bandwidth yourself. Moreover, hardly anyone marks the IP Precedence field anymore, so packets arrive unmarked and all end up in a single queue.

    The development of WFQ was Class-Based Weighted Fair Queuing (CBWFQ). In this queuing, the administrator defines the traffic classes himself, based on various criteria, for example by using ACLs as a template or by analyzing protocol headers (see NBAR). Then a "weight" is assigned to each class, and the packets of its queue are serviced in proportion to that weight (the greater the weight, the more packets from this queue go per unit of time).

    But such queuing does not guarantee strict transmission of the most important packets (usually voice, or packets of other interactive applications). So a hybrid of Priority Queuing and Class-Based Weighted Fair Queuing appeared: PQ-CBWFQ, also known as Low Latency Queuing (LLQ). In this technology up to 4 priority queues can be defined, with the remaining classes serviced by the CBWFQ mechanism.

    LLQ is the most convenient, flexible, and frequently used mechanism. But it does require configuring classes, configuring a policy, and applying the policy to an interface.

    I’ll tell you more about the settings later.

    Thus, the process of providing quality of service can be divided into 2 stages:
    Marking. Closer to the sources.
    Packet processing. Placing packets into the physical queue on the interface, subdividing them into logical queues, and giving those logical queues different resources.

    QoS technology is quite resource-intensive and loads the processor quite significantly; and it loads it all the more, the deeper into the headers it has to look to classify packets. For comparison, it is much easier for a router to look into the IP packet header and analyze the 3 IPP bits there than to unwind the flow almost up to the application level to determine what protocol is running inside (NBAR technology).

    To simplify further traffic processing, and also to create a so-called "trusted boundary", inside which we believe all QoS-related headers, we can do the following:
    1. On access-level switches and routers (close to the client machines), catch the packets and sort them into classes.
    2. In the policy, as an action, rewrite the headers our own way, or map the QoS header values of a higher level onto those of a lower level.

    For example, on the router we catch all packets from the guest WiFi domain (assuming there may be computers and software there that we do not control and that may use non-standard QoS headers), reset any IP headers to default values, and map the layer 3 headers (DSCP) onto the link layer headers (CoS), so that the switches further on can effectively prioritize traffic using only the link layer label.
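
    A minimal sketch of such a boundary policy (the class name, policy name, and interfaces are illustrative; set cos takes effect only on 802.1Q trunk egress, and platform support varies):

    ! catch everything arriving from the guest WiFi segment
    class-map match-all GUEST
    match input-interface FastEthernet0/1
    !
    policy-map TRUST_BOUNDARY
    class GUEST
    ! reset any third-party DSCP marking and align the link-layer label
    set ip dscp default
    set cos 0
    !
    int FastEthernet0/0
    service-policy output TRUST_BOUNDARY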

    LLQ setup

    Setting up queues consists of configuring classes; then, for these classes, you define bandwidth parameters and apply the whole created construct to an interface.

    Creating classes:

    class-map NAME
    match ?

    access-group Access group
    any Any packets
    class-map Class map
    cos IEEE 802.1Q/ISL class of service/user priority values
    destination-address Destination address
    discard-class Discard behavior identifier
    dscp Match DSCP in IP (v4) and IPv6 packets
    flow Flow based QoS parameters
    fr-de Match on Frame-relay DE bit
    fr-dlci Match on fr-dlci
    input-interface Select an input interface to match
    ip IP specific values
    mpls Multi Protocol Label Switching specific values
    not Negate this match result
    packet Layer 3 Packet length
    precedence Match Precedence in IP (v4) and IPv6 packets
    protocol Protocol
    qos-group Qos-group
    source-address Source address
    vlan VLANs to match



    Packets can be sorted into classes by various attributes, for example by specifying an ACL as a template, by the DSCP field, or by picking out a specific protocol (this enables NBAR technology).
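
    For instance, each of the three approaches might look like this (all names are illustrative):

    ! by ACL template
    class-map match-all FROM_ACL
    match access-group name MY_ACL
    ! by DSCP field
    class-map match-all VOICE
    match dscp ef
    ! by protocol (enables NBAR)
    class-map match-all WEB
    match protocol http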

    Creating a policy:

    policy-map POLICY
    class NAME1
    ?

    bandwidth Bandwidth
    compression Activate Compression
    drop Drop all packets
    log Log IPv4 and ARP packets
    netflow-sampler NetFlow action
    police Police
    priority Strict Scheduling Priority for this Class
    queue-limit Queue Max Threshold for Tail Drop
    random-detect Enable Random Early Detection as drop policy
    service-policy Configure Flow Next
    set Set QoS values
    shape Traffic Shaping


    For each class in the policy, you can either carve out a priority piece of the bandwidth:

    policy-map POLICY
    class NAME1
    priority ?

    <8-2000000> Kilo Bits per second
    percent % of total bandwidth


    and then the packets of this class can always count on at least this piece.

    Or describe what "weight" a given class has within the CBWFQ:

    policy-map POLICY
    class NAME1
    bandwidth ?

    <8-2000000> Kilo Bits per second
    percent % of total Bandwidth
    remaining % of the remaining bandwidth


    In both cases you can specify either an absolute value or a percentage of the entire available bandwidth.

    A reasonable question arises: where does the router get the notion of "the entire bandwidth"? The answer is banal: from the bandwidth parameter on the interface. Even if it is not explicitly configured, it always has some value; you can see it with the sh int command.
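
    For example (output abridged; the value shown is the default for a serial interface and will vary):

    Router# sh int s0/0 | include BW
      MTU 1500 bytes, BW 1544 Kbit/sec, DLY 20000 usec,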

    It is also necessary to remember that by default you manage not the entire bandwidth, but only 75% of it. Packets that do not explicitly fall into other classes end up in class-default. The bandwidth setting can be made explicitly for the default class

    policy-map POLICY
    class class-default
    bandwidth percent 10


    (UPD, thanks OlegD)
    You can raise the maximum available bandwidth from the default 75% with the interface command

    max-reserved-bandwidth [percent]
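
    For example, to let policies reserve up to 90% of the interface bandwidth:

    int s0/0
    max-reserved-bandwidth 90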

    Routers zealously watch that the admin does not accidentally hand out more bandwidth than there is, and swear at such attempts.

    It might seem that the policy will give each class no more than what is written for it. However, that only happens when all the queues are full. If one of them is empty, the bandwidth allocated to it is divided among the loaded queues in proportion to their "weight".

    The whole construct works like this:

    If packets of a class with priority arrive, the router focuses on forwarding those packets. Since there can be several such priority queues, the bandwidth is divided among them in proportion to the specified percentages.

    As soon as all the priority packets are gone, it is CBWFQ's turn. In each cycle, the share of packets specified in the settings for its class is "scooped" from each queue. If some of the queues are empty, their bandwidth is divided among the loaded queues in proportion to the "weight" of their classes.

    Application on the interface:

    int s0/0
    service-policy [input|output] POLICY
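
    Putting the pieces together, a minimal LLQ construct might look like this (the class names, the ACL, and the numbers are illustrative):

    class-map match-all VOICE
    match dscp ef
    class-map match-all BUSINESS
    match access-group name BUSINESS_ACL
    !
    policy-map LLQ
    class VOICE
    ! strict priority for voice
    priority percent 25
    class BUSINESS
    ! a CBWFQ weight for business traffic
    bandwidth percent 30
    class class-default
    fair-queue
    !
    int s0/0
    service-policy output LLQ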


    But what if you need to strictly cut off packets of a class that exceed a permitted rate? After all, bandwidth only distributes the bandwidth among the classes when the queues are loaded.

    To solve this problem for a traffic class, the policy offers the technology

    police [speed] [burst] conform-action [action] exceed-action [action]

    It allows you to explicitly specify the desired average rate (speed) and the maximum burst, i.e., the amount of data transmitted per unit of time. The larger the burst, the more the actual transmission rate may deviate from the desired average. Also specified are an action for normal traffic that does not exceed the
    specified rate, and an action for traffic exceeding the average rate. The actions can be

    police 100000 8000 conform-action ?

    drop drop packet
    exceed-action action when rate is within conform and conform + exceed burst
    set-clp-transmit set atm clp and send it
    set-discard-class-transmit set discard-class and send it
    set-dscp-transmit set dscp and send it
    set-frde-transmit set FR DE and send it
    set-mpls-exp-imposition-transmit set exp at tag imposition and send it
    set-mpls-exp-topmost-transmit set exp on topmost label and send it
    set-prec-transmit rewrite packet precedence and send it
    set-qos-transmit set qos-group and send it
    transmit transmit packet
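
    For example, a minimal policer holding a class to 1 Mbit/s with a 187500-byte burst and dropping the excess (the class name is illustrative):

    policy-map LIMIT
    class P2P
    police 1000000 187500 conform-action transmit exceed-action drop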




    Often another problem arises as well. Suppose you need to limit a flow going toward a neighbor with a slow channel.

    [figure: the central office (CO) connected to a Remote office over a slow channel]

    In order to accurately predict which packets will reach the neighbor and which will be destroyed due to congestion of the channel on the "slow" side, you must create a policy on the "fast" side that processes the queues in advance and destroys the excess packets.

    And here we run into one very important thing: to solve this problem we need to emulate the "slow" channel. For this emulation it is not enough just to sort packets into queues; we also need to emulate the physical buffer of the "slow" interface. Each interface has a packet rate: per unit of time an interface can transmit no more than N packets. Usually the physical interface buffer is sized to ensure "autonomous" operation of the interface for several such units of time. Therefore, the physical buffer of, say, a GigabitEthernet interface will be tens of times larger than that of any Serial interface.

    What is bad about buffering a lot? Let's take a closer look at what happens if the buffer on the fast sending side is significantly larger than the buffer of the receiving side.

    For simplicity, suppose there is 1 queue. On the "fast" side we emulate a lower transmission rate. That means packets falling under our policy begin to accumulate in the queue. Since the physical buffer is large, the logical queue will grow impressive. Some applications (those working over TCP) will be late to receive notification that certain packets were not received, and will keep a large window size for a long time, loading the receiving side. This happens in the ideal case, when the transmission rate equals or is lower than the reception rate. But the receiving interface may also be loaded with other packets,
    and then the small queue on the receiving side cannot accommodate all the packets sent to it from the center. Losses begin, which entail retransmissions, while the transmit buffer still holds a solid "tail" of previously accumulated packets that will be transmitted "idly": the receiving side never got an earlier packet, so the later ones will simply be ignored.

    Therefore, in order to correctly solve the problem of reducing the transmission rate to a slow neighbor, the physical buffer must also be limited.

    This is done with the command

    shape average [speed]

    And now the most interesting part: what if, besides emulating a physical buffer, we need to create logical queues inside it? For example, to prioritize voice?

    For this a so-called nested policy is created, which is applied inside the main one and divides into logical queues whatever falls into it from the parent.

    The time has come to work through a simple example based on the picture above.

    Suppose we plan to build reliable voice channels over the Internet between CO and Remote. For simplicity, let the Remote network (172.16.1.0/24) have a connection only with CO (10.0.0.0/8). The interface speed at Remote is 1 Mbit/s, and 25% of this speed is allocated to voice traffic.

    To start, we need to select a priority traffic class on both sides and create a policy for this class. At CO we additionally create a class describing the traffic between the offices.

    In CO:

    class-map RTP
    match protocol rtp

    policy-map RTP
    class RTP
    priority percent 25

    ip access-list extended CO_REMOTE
    permit ip 10.0.0.0 0.255.255.255 172.16.1.0 0.0.0.255

    class-map CO_REMOTE
    match access-group name CO_REMOTE


    At Remote we will do it differently: suppose we cannot use NBAR, so we can only describe the RTP ports explicitly:

    ip access-list extended RTP
    permit udp 172.16.1.0 0.0.0.255 range 16384 32768 10.0.0.0 0.255.255.255 range 16384 32768

    class-map RTP
    match access-group name RTP

    policy-map QoS
    class RTP
    priority percent 25



    Next, at CO we need to emulate the slow interface and apply the nested policy inside it to prioritize voice packets

    policy-map QoS
    class CO_REMOTE
    shape average 1000000
    service-policy RTP


    and apply the policy on the interface

    int g0/0
    service-policy output QoS


    At Remote, set the bandwidth parameter (in kbit/s) according to the interface speed. Let me remind you that it is from this parameter that the 25% will be calculated. And apply the policy.

    int s0/0
    bandwidth 1000
    service-policy output QoS


    The story would not be complete without covering the capabilities of switches. It is clear that purely L2 switches cannot look so deep into packets and divide them into classes by the same criteria.

    On the smarter L2/L3 switches, the same constructs that work on routers apply to routed interfaces (i.e., either to an interface vlan, or to a port taken out of layer 2 with the no switchport command); if a port or the whole switch operates in L2 mode (true for the 2950/60 models), then only the police action can be used for a traffic class, while priority and bandwidth are unavailable.

    From a purely defensive point of view, knowing the basics of QoS lets you quickly prevent bottlenecks caused by worms. As you know, a worm is quite aggressive in its propagation phase and creates a lot of parasitic traffic, in effect a Denial of Service (DoS) attack.

    Moreover, a worm often spreads over ports needed for normal operation (TCP/135, 445, 80, etc.). Simply closing these ports on the router would be rash, so it is more humane to act like this:

    1. Collect statistics on network traffic, via NetFlow, NBAR, or SNMP.

    2. Identify the profile of normal traffic: i.e., according to the statistics, the HTTP protocol takes on average no more than 70%, ICMP no more than 5%, and so on. You can create such a profile either manually or from the statistics accumulated by NBAR. Better yet, you can even automatically create the classes and policies and apply them on the interface
    with the autoqos command :)

    3. Next, you can limit the bandwidth for atypical network traffic. If an infection suddenly breaks out over a non-standard port, it will be no great trouble for the gateway: on a loaded interface the infection will take no more than its allocated share.

    4. With the construct (class-map - policy-map - service-policy) in place, you can quickly respond to the appearance of an atypical burst of traffic by manually creating a class for it and severely limiting that class's bandwidth, as in the sketch below.
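
    A sketch of such a quick response, severely throttling a suspicious burst on TCP/445 (the names and numbers are illustrative):

    ip access-list extended WORM
    permit tcp any any eq 445
    !
    class-map match-all WORM
    match access-group name WORM
    !
    policy-map INET
    class WORM
    police 64000 8000 conform-action transmit exceed-action drop
    !
    int g0/0
    service-policy output INET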

    Sergey Fedorov
