Maximum Transmission Unit (MTU): Myths and Pitfalls

    The maximum transmission unit (MTU) is the largest amount of data that a protocol can transmit in a single unit. For example, the Ethernet MTU is 1500, which means that an Ethernet frame can carry at most 1500 bytes of payload (not counting the Ethernet header and FCS; see Fig. 1).

    image
    Fig. 1

    Let's walk through the OSI layers and see how MTU applies at each one:

    Layer 2.


    Ethernet MTU is a special case of hardware (HW) MTU. The definition of HW MTU follows from the general definition:
    HW MTU is the maximum packet size that the interface can transmit in a single frame (at least, that is the value stated in the device specifications; in practice, some chipsets can transfer larger packets than stated). So if we generalize Figure 1 beyond Ethernet, we get the following:
    image
    Fig. 2

    Note: a caveat is needed here. As you can see, the HW MTU (Ethernet MTU in particular) does not include the L2 header. That is true for IOS and IOS XE, but on IOS XR and JunOS the L2 header is included in the HW MTU size (Fig. 3). This difference can cause problems when establishing an OSPF adjacency between platforms running IOS (XE) and IOS XR, because OSPF requires the interface MTU advertised in Database Description packets to match. Therefore, when configuring the MTU on Ethernet interfaces, the IOS XR value should be 14 bytes larger (12 bytes for the source and destination MAC addresses plus 2 bytes for the EtherType). For example, an MTU of 1500 in Cisco IOS is equivalent to an MTU of 1514 in IOS XR.

    image
    Fig. 3
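
    The 14-byte offset from the note above can be written down as a small helper. This is only a sketch: the function names are made up for illustration, and it assumes plain untagged Ethernet (no 802.1Q tag or other extra L2 headers).

```python
# Ethernet L2 header as counted by IOS XR: dst MAC + src MAC + EtherType.
ETHERNET_L2_HEADER = 6 + 6 + 2  # 14 bytes


def ios_to_iosxr_mtu(ios_mtu: int) -> int:
    """Return the IOS XR interface MTU equivalent to an IOS (XE) MTU (hypothetical helper)."""
    return ios_mtu + ETHERNET_L2_HEADER


def iosxr_to_ios_mtu(xr_mtu: int) -> int:
    """Return the IOS (XE) MTU equivalent to an IOS XR interface MTU (hypothetical helper)."""
    return xr_mtu - ETHERNET_L2_HEADER


print(ios_to_iosxr_mtu(1500))  # 1514
```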

    Configuration and verification.

    To change the MTU on routers running Cisco IOS, use the interface-level command:
    R01(config)#interface gigabitEthernet 5/1 
    R01(config-if)#mtu 1532
    R01(config-if)#exit
    

    Verify:
    R01#show interfaces gigabitEthernet 5/1
    GigabitEthernet5/1 is up, line protocol is up (connected)
      Hardware is C6k 1000Mb 802.3, address is 0008.e3ff.fde0 (bia 0008.e3ff.fde0)
      Description: -- --
      MTU 1532 bytes, BW 1000000 Kbit, DLY 10 usec, 
         reliability 255/255, txload 82/255, rxload 20/255
      Encapsulation ARPA, loopback not set
      Keepalive set (10 sec)
      Full-duplex, 1000Mb/s, media type is LH
    ..... OUTPUT OMITTED
    

    And:
    R01#show run interface gigabitEthernet 5/1
    interface GigabitEthernet 5/1
     description -- --
     no switchport
     mtu 1532
     ip address 192.168.1.1 255.255.255.0
    end
    


    Layer3.


    IP MTU is the maximum size of a packet, including the IP header, that can be transmitted on the interface without fragmentation. The relationship between the IP MTU and the HW MTU is described by the following formula:
    IP MTU ≤ HW MTU
    Accordingly, when a packet larger than the configured IP MTU has to be sent out of the interface, it is either fragmented or, if the DF (Don't Fragment) flag is set in the IP header, discarded. In the latter case the device may generate an ICMP Fragmentation Needed message, which is used by the path MTU discovery mechanism (more on that later), and send it back to the sender of the original packet.
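
    The decision described above can be sketched in a few lines of Python. This is an illustration only, not how any particular router implements it: the function names are invented, and the fragment calculation assumes a 20-byte IPv4 header without options.

```python
def egress_action(packet_size: int, ip_mtu: int, df_set: bool) -> str:
    """Decide what happens to an IP packet leaving an interface (illustrative)."""
    if packet_size <= ip_mtu:
        return "forward"
    if df_set:
        return "drop (may send ICMP Fragmentation Needed)"
    return "fragment"


def fragment_sizes(packet_size: int, ip_mtu: int, ip_header: int = 20) -> list:
    """Sizes of the IP fragments (header included) for a packet of packet_size bytes."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (ip_mtu - ip_header) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("ip_mtu is too small to carry any payload")
    payload = packet_size - ip_header
    sizes = []
    while payload > 0:
        chunk = min(payload, max_chunk)
        sizes.append(chunk + ip_header)
        payload -= chunk
    return sizes


print(egress_action(1532, 1500, df_set=True))   # drop (may send ICMP Fragmentation Needed)
print(egress_action(1532, 1500, df_set=False))  # fragment
print(fragment_sizes(1532, 1500))               # [1500, 52]
```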

    Configuration and verification.

    To change the IP MTU on routers running Cisco IOS, use the interface-level command:
    R01(config)#interface gigabitEthernet 5/1 
    R01(config-if)#ip mtu 1532
    R01(config-if)#exit

    Verify:
    R01#show ip interface gigabitEthernet 5/1
    GigabitEthernet5/1 is up, line protocol is up
      Internet address is 192.168.1.1/24
      Broadcast address is 255.255.255.255
      Address determined by non-volatile memory
      MTU is 1532 bytes
      Helper address is not set
      Directed broadcast forwarding is disabled
      Multicast reserved groups joined: 224.0.0.5 224.0.0.2
      Outgoing access list is not set
      Inbound  access list is not set
    ..... OUTPUT OMITTED
    

    And:
    R01#show run interface gigabitEthernet 5/1
    interface GigabitEthernet 5/1
     description -- --
     no switchport
     mtu 1532
     ip address 192.168.1.1 255.255.255.0
     no ip redirects
     no ip unreachables
     no ip proxy-arp
    end
    


    Well, how about that: the ip mtu command is not visible in show run. Yes, there is an interesting caveat here: if the IP MTU matches the HW MTU, only the HW MTU is displayed in the show run output. If the values differ, both are displayed.

    Layer 4.


    TCP Maximum Segment Size (MSS) defines the maximum size of a TCP segment (without the TCP header!) that can be sent or received during a TCP session. The MSS values are announced (announced, not negotiated) during TCP session setup: each side tells the other what segment size it can accept. Accordingly, the TCP MSS may differ within the same TCP session depending on the direction.

    image
    Fig. 4

    The party making the announcement calculates the TCP MSS value for itself using the following formula:
    TCP MSS = IP MTU – (IPHDR + TCPHDR)
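
    As a quick worked example of the formula, here is a minimal sketch. It assumes the common case of a 20-byte IPv4 header and a 20-byte TCP header (no options in either); the function name is illustrative.

```python
IP_HEADER = 20   # IPv4 header without options (assumption)
TCP_HEADER = 20  # TCP header without options (assumption)


def tcp_mss(ip_mtu: int, ip_header: int = IP_HEADER, tcp_header: int = TCP_HEADER) -> int:
    """TCP MSS = IP MTU - (IP header + TCP header)."""
    return ip_mtu - (ip_header + tcp_header)


print(tcp_mss(1500))  # 1460
```

    With the default Ethernet IP MTU of 1500 this gives 1460, which matches the upper bound that the adjust-mss command accepts in the listing in the Configuration section.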

    Configuration.

    There are two possible scenarios here: the router is either a transit device or a participant in the TCP session.
    1) Transit device:
    To keep an intermediate device from dropping packets on a link with a small MTU, the router intercepts TCP SYN packets and rewrites the MSS value announced by the end device. As a result, the other side sends smaller segments to that end device, and voila: drops on the small-MTU link are avoided.
    R01(config)#interface gigabitEthernet 5/1 
    R01(config-if)#ip tcp adjust-mss?
    <500-1460>  Maximum segment size in bytes
    

    2) Terminating device:
    Everything is simple here: the router itself is a participant in the TCP session, and we can force the MSS value it announces.
    R01(config)#ip tcp mss?
    <0-10000>  MSS
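
    For the transit case, the effect of MSS adjustment on a passing SYN can be modelled as a simple clamp: the announced value is lowered if it exceeds the configured one and is never raised. The sketch below is a conceptual model only; the function name and the sample values are assumptions for illustration.

```python
def clamp_mss(announced_mss: int, adjust_mss: int) -> int:
    """MSS left in the SYN after the transit router has processed it (conceptual model)."""
    return min(announced_mss, adjust_mss)


# A host behind a link with IP MTU 1400 announces the usual Ethernet MSS of 1460.
# With adjust-mss set to 1400 - 40 = 1360, the peer sees 1360 instead.
print(clamp_mss(1460, 1360))  # 1360
# A value that is already smaller passes through unchanged.
print(clamp_mss(1200, 1360))  # 1200
```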
    


    Is that everything? No, not quite. Remember MPLS? Let's take a look at it.

    Layer 2.5. MPLS



    image
    Fig. 5

    MPLS MTU determines the maximum size of a labeled IP packet. If the size of the labeled packet exceeds the MPLS MTU, the packet is either fragmented or, if the DF bit is set in the IP header, dropped (the logic is the same as when the IP MTU is exceeded), possibly with an ICMP Fragmentation Needed message being sent.

    Comment: Here things are a little different compared to IP MTU. In an MPLS network, an intermediate node may not have a route to the sender of the packet, so instead of being sent directly to the sender, the ICMP message is encapsulated with the same label stack as the original packet and forwarded along that packet's path. When it reaches the egress LSR (the last MPLS router on this LSP, beyond which lies the unlabeled IP network), which knows the IP routes to the source host, the ICMP Fragmentation Needed message is decapsulated, encapsulated with the necessary headers, and sent back through the MPLS network to the sender of the original packet. The behavior is the same as for TTL Expired and belongs to the topic of MPLS rather than MTU. For those not familiar with the process: www.google.kg/?gws_rd=ssl#q=mpls+ttl+expired

    What else is interesting here? The MPLS MTU may be larger than the HW MTU (which is why in Figure 3 the HW MTU is partially drawn with a dotted line). IOS will issue a warning, but in most cases it will work (depending on the interface chipset) and successfully pass at least baby-giant frames. And sometimes you can get dropped packets, data corruption, and a hundred years of bad harvests.
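
    To see why an MPLS MTU above 1500 is often wanted, it helps to add up the label overhead. Each MPLS label stack entry is 4 bytes, so a labeled packet is the IP packet plus 4 bytes per label. The sketch below uses invented function names and sample values.

```python
MPLS_LABEL_SIZE = 4  # bytes per MPLS label stack entry


def labeled_packet_size(ip_packet_size: int, labels: int) -> int:
    """Size of an IP packet after `labels` MPLS labels have been pushed onto it."""
    return ip_packet_size + labels * MPLS_LABEL_SIZE


def fits_mpls_mtu(ip_packet_size: int, labels: int, mpls_mtu: int) -> bool:
    """True if the labeled packet can be sent without fragmentation."""
    return labeled_packet_size(ip_packet_size, labels) <= mpls_mtu


print(labeled_packet_size(1500, 2))  # 1508
print(fits_mpls_mtu(1500, 2, 1500))  # False
print(fits_mpls_mtu(1500, 2, 1540))  # True
```

    A full-size 1500-byte IP packet with two labels no longer fits into an MPLS MTU of 1500, which is exactly the baby-giant situation mentioned above.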

    Configuration and verification.

    R01(config)#interface gigabitEthernet 5/1 
    R01(config-if)#mpls mtu 1540
    R01(config-if)#exit
    

    Verify:
    R01#show mpls interfaces gigabitEthernet 5/1 detail 
    Interface gigabitEthernet 5/1:
            IP labeling enabled (ldp):
              Interface config
            LSP Tunnel labeling not enabled
            BGP labeling not enabled
            MPLS operational
            MTU = 1540
    

    Note: Like the IP MTU, the MPLS MTU is displayed in the running config only if its value differs from the HW MTU. But, unlike the IP MTU, any change to the HW MTU resets the MPLS MTU to the HW MTU value (the IP MTU is not changed by this action).

    MTU on Cisco switches.


    Switches do not support setting the MTU on each interface separately (this applies to switchport and VLAN interfaces; on multilayer switches, routed ports are configured the same way as on routers). You can change the MTU settings for switch ports in three ways, depending on the port type:
    • SW01(config)#system mtu 1600 - change L2 MTU on FastEthernet ports
    • SW01(config)#system mtu jumbo 1600 - change L2 MTU on GigabitEthernet and Ten GigabitEthernet ports
    • SW01(config)#system mtu routing 1600 - change L3 MTU on routable interfaces

    Verify:
    SW01#show system mtu
    System MTU size is 1600 bytes
    System Jumbo MTU size is 1600 bytes
    Routing MTU size is 1600 bytes
    


    Note to the administrator.


    Since the main way to check MTU to this day is the ping command with the DF bit set and an explicit packet size, I'll conclude with a couple of useful tricks:
    1) To find the minimum MTU (a funny combination of words) along a path, you can use the extended ping command, both from end stations / servers and from Cisco equipment. From router R01 we ping router R02 with the DF bit set, a starting packet size of 1000 bytes, a final size of 1500 bytes, and an increment of 100 bytes. The repeat count is 2.
    R01#ping
    Protocol [ip]:
    Target IP address: 192.168.12.2
    Repeat count [5]: 2
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]: y
    Source address or interface: 192.168.12.1
    Type of service [0]:
    Set DF bit in IP header? [no]: y
    Validate reply data? [no]:
    Data pattern [0xABCD]:
    Loose, Strict, Record, Timestamp, Verbose[none]:
    Sweep range of sizes [n]: y
    Sweep min size [36]: 1000
    Sweep max size [18024]: 1500
    Sweep interval [1]: 100
    Type escape sequence to abort.
    Sending 12, [1000..1500]-byte ICMP Echos to 192.168.12.2, timeout is 2 seconds:
    Packet sent with a source address of 192.168.12.1
    Packet sent with the DF bit set
    !!!!..!!!!..
    Success rate is 66 percent (8/12), round-trip min/avg/max = 4/24/56 ms
    


    As you can see, only 8 of the 12 ICMP packets get through: those with sizes of 1000, 1100, 1200, and 1300 bytes (two of each).
    Packets of 1400 bytes and above fail. Consequently, the minimum MTU between the two points lies between 1300 and 1400 bytes, which can be pinned down in a few more passes by narrowing the range and reducing the step.
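
    Narrowing the range by hand is essentially a bisection, which can be sketched as follows. The probe function here is a stand-in for a real DF-bit ping, and the value 1372 is a made-up path MTU used only to make the example runnable.

```python
from typing import Callable


def find_path_mtu(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Largest packet size for which probe(size) succeeds.

    Assumes probe(low) is True and probe(high) is False, like the
    1300 / 1400 pair found by the sweep above.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid    # mid passes, so the limit is at or above it
        else:
            high = mid   # mid fails, so the limit is below it
    return low


def fake_probe(size: int) -> bool:
    """Stand-in for a DF-bit ping; pretends the path MTU is 1372 (hypothetical)."""
    return size <= 1372


print(find_path_mtu(fake_probe, 1300, 1400))  # 1372
```

    In this run only seven probes are needed to narrow a 100-byte window down to a single value.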


    2) A common problem in the interaction between network and system administrators is that the maximum packet size that passes when pinging from an end device differs from the (larger) size that passes when pinging from the nearest network device. The reason is that operating systems (Windows in particular) treat the size given to the ping command as pure payload, without the ICMP and IP headers. That is, if ping 192.168.1.2 -l 100 is specified, the system generates packets of 128 bytes rather than 100 (8 bytes of ICMP header and 20 bytes of IP header). On Cisco network equipment, the ICMP packet size you specify already includes both headers. Therefore, on a default Ethernet link, Windows (for example) will report 1472 bytes as the maximum size that passes without fragmentation, while Cisco will report 1500 bytes. JunOS,
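
    The 28-byte difference between the two conventions is easy to capture in a pair of helpers. This is a small sketch with illustrative function names, assuming a 20-byte IPv4 header without options.

```python
IP_HEADER = 20   # IPv4 header without options (assumption)
ICMP_HEADER = 8  # ICMP echo header


def windows_to_cisco_ping_size(payload: int) -> int:
    """Windows `ping -l` payload size -> full IP packet size as Cisco counts it."""
    return payload + ICMP_HEADER + IP_HEADER


def cisco_to_windows_ping_size(packet_size: int) -> int:
    """Cisco ping datagram size -> payload size to pass to Windows `ping -l`."""
    return packet_size - ICMP_HEADER - IP_HEADER


print(windows_to_cisco_ping_size(100))   # 128
print(cisco_to_windows_ping_size(1500))  # 1472
```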


    That's all. I also have an old draft article in my archives on frame sizes and their evolution, covering the Jumbo Frame and Baby-Giant Frame concepts mentioned in this article. If you think it would be useful, I can revise and publish it.
