Network overlay technologies: OTV, LISP, and a summary. Part 3

In the previous parts (first, second) of this article, we covered the general classification of network overlay technologies for data centers and took a closer look at TRILL, FabricPath, and VXLAN. That leaves OTV and LISP, and then it is time to take stock.
OTV
OTV (Overlay Transport Virtualization) was created by Cisco to interconnect distributed segments of an L2 network over an ordinary IP network. With OTV, we can “stretch” a broadcast domain across two or more data centers. You could say it is one of the L2 VPN implementations that runs over an IP network.
OTV implements the concept of routing based on MAC addresses. It works as follows. Devices running OTV (hereinafter, OTV devices) use the dynamic routing protocol IS-IS to exchange information about the MAC addresses of the hosts behind them. As a result, every OTV device builds the same MAC-address routing table. When an OTV device then receives a frame whose destination MAC address belongs to a host somewhere far away (that is, in a remote segment of the L2 network), it looks up in this table where to forward the frame. A new header is added, and the frame is sent first to the remote OTV device and then on to the destination host. When Ethernet frames are transferred between OTV devices, they are encapsulated in UDP datagrams.
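To make the idea concrete, here is a minimal sketch of this “MAC routing” logic in Python. Everything in it (the table contents, the addresses, the function name) is invented for illustration; real OTV populates the table via IS-IS and does the encapsulation in hardware.

```python
# Illustrative only: an OTV edge device maps remote MAC addresses to the IP
# address of the OTV device behind which each MAC lives. In real OTV this
# table is populated by IS-IS; here we fill it in by hand.
mac_table = {
    "00:1b:2c:3d:4e:5f": "10.0.1.1",   # host in DC 1 -> OTV device in DC 1
    "00:aa:bb:cc:dd:ee": "10.0.2.1",   # host in DC 2 -> OTV device in DC 2
}

def forward(dst_mac: str) -> None:
    """Decide where an Ethernet frame arriving at an OTV device should go."""
    remote_otv_ip = mac_table.get(dst_mac)
    if remote_otv_ip is None:
        # unknown unicast is not flooded across the overlay (more on this below)
        print(f"{dst_mac}: unknown, not sent across the overlay")
    else:
        # the real device would encapsulate the frame in a UDP datagram here
        print(f"{dst_mac}: encapsulate and send to {remote_otv_ip}")

forward("00:aa:bb:cc:dd:ee")   # -> encapsulate and send to 10.0.2.1
```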

Many readers have probably noticed that OTV is very similar in logic to VXLAN. But unlike VXLAN, OTV is designed primarily for connecting L2 networks between data centers. The second point is that we are again dealing with IS-IS: just as in TRILL/FabricPath, this dynamic routing protocol serves as the control plane (the implementation of the operating logic).
Let's see how OTV devices find each other to establish communication. OTV supports two modes of operation for this.
In the first mode, multicast traffic is used for communication between OTV devices. Each OTV device tells the network (via the IGMP protocol) that it wants to receive multicast traffic addressed to a specific group. The messages for establishing and maintaining adjacencies (Hello packets), as well as the MAC-address data, are then sent as multicast messages, which the network delivers to all OTV devices. This behavior closely resembles an ordinary dynamic routing protocol, except that the OTV devices can sit on different subnets. For all of this to work, our network must support the forwarding of multicast traffic, including its routing. As noted in the previous article, ensuring this is not always easy (for example, when the Internet is used between data centers).
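The group join itself can be illustrated with ordinary sockets: joining a multicast group is exactly what makes the OS emit an IGMP membership report. The group address and UDP port below are arbitrary example values, not something from the OTV specification.

```python
import socket
import struct

GROUP = "239.1.1.1"   # example control group, chosen arbitrarily
PORT = 12345          # example port, not an OTV value

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Joining the group triggers an IGMP membership report -- the same mechanism
# an OTV device uses to tell the network it wants this group's traffic.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# From now on, anything sent to 239.1.1.1:12345 (Hello packets and MAC
# advertisements, in the real protocol) is delivered to this socket.
sock.settimeout(2.0)
try:
    data, sender = sock.recvfrom(2048)
    print(f"control-plane message from {sender}")
except socket.timeout:
    print("no messages yet (nobody is sending to the group)")
```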
When unicast traffic is used for interaction between OTV devices, one of them is configured as an Adjacency Server, and its address is specified by hand on all the other OTV devices during configuration. Every OTV device establishes a connection with it. The Adjacency Server thus collects data about all OTV devices (in particular, it learns their IP addresses) and distributes this information to all of them. Having received it, each OTV device can now reach the others with unicast packets, since it knows their IP addresses. Once the adjacencies between OTV devices are established, the exchange of MAC-address tables begins. Naturally, this mode of operation puts more load on the OTV devices than the multicast mode does.
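Here is a toy model of this mode, with invented names and addresses: one object plays the Adjacency Server, the other devices register with it and receive the full peer list back.

```python
# Illustrative only: the Adjacency Server collects the addresses of all OTV
# devices and hands each of them the complete peer list.
class AdjacencyServer:
    def __init__(self) -> None:
        self.peers: set[str] = set()

    def register(self, otv_ip: str) -> set[str]:
        """A new OTV device checks in; it receives the current peer list."""
        self.peers.add(otv_ip)
        return set(self.peers)

server = AdjacencyServer()
for device_ip in ("10.0.1.1", "10.0.2.1", "10.0.3.1"):
    others = server.register(device_ip) - {device_ip}
    # knowing every peer's IP, the device can now send Hellos and MAC
    # advertisements to each of them individually (hence the extra load)
    print(f"{device_ip} learned peers: {sorted(others)}")
```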
To reduce the mutual influence of the distributed L2 segments on each other and to optimize the traffic between them, OTV partially isolates them:
- BPDU packets of the STP protocols are not forwarded between OTV devices. Each segment builds its own independent STP topology.
- Unknown unicast traffic is not forwarded between OTV devices. This logic rests on the assumption that any device will sooner or later transmit something, and its MAC address will then be learned.
- ARP messages are optimized: OTV devices cache ARP replies coming from remote hosts, so the local OTV device can answer an ARP request if someone has already sent a similar one.
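The ARP optimization is easy to picture as a cache sitting in front of the overlay. The sketch below is purely illustrative, with made-up addresses.

```python
# Illustrative only: the OTV device caches ARP replies from remote hosts and
# answers repeated requests locally, keeping them off the inter-DC link.
arp_cache: dict[str, str] = {}   # IP -> MAC, learned from remote ARP replies

def handle_arp_request(target_ip: str) -> str | None:
    cached_mac = arp_cache.get(target_ip)
    if cached_mac is not None:
        return cached_mac          # answered locally, nothing crosses the overlay
    return None                    # must be forwarded; the reply will be cached

arp_cache["192.0.2.10"] = "00:aa:bb:cc:dd:ee"   # learned from an earlier reply
print(handle_arp_request("192.0.2.10"))   # -> served from the cache
print(handle_arp_request("192.0.2.11"))   # -> None, goes across the overlay
```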
As noted earlier, OTV, unlike VXLAN, provides communication only between data centers. VXLAN is in fact the more universal protocol; on the other hand, OTV's hardware requirements are somewhat lower. Cisco often recommends OTV in inter-data-center designs where FabricPath is used inside the data centers.
OTV is supported on the following Cisco equipment: Nexus 7K, ASR 1K, ISR 4451-X.
LISP
LISP (Cisco Locator/ID Separation Protocol), unlike the technologies discussed so far, serves a different purpose. Everyone knows the problem of routing traffic correctly after a virtual machine has moved, for example, from one data center to another. It is fine if the other data center uses the same address space. But if the subnets are completely different, how do we tell the client equipment that traffic must now be sent not to data center 1 but to data center 2, possibly to a different address? One solution to this problem is to split the server's identity into two parts: the server address itself (the Endpoint Identifier, or EID, address) and the address of the device through which the server is reachable (the Route Locator, or RLOC, address). When a virtual machine moves to another data center, its EID address thus always stays the same.

In general terms, LISP works as follows. After being “switched on” in the network, the devices where LISP is configured (xTRs, in LISP terms) inform a dedicated server (the Map-Server) about the networks they serve, i.e., for which they are the default gateway. Then, for example, the ROUTER2 device, having received traffic from a client to the server (to the address EID = 1.1.1.2), queries a dedicated device (the Map-Resolver), which in turn consults the Map-Server. The Map-Server, having found in its database the address RLOC1 (the address of ROUTER1) for the requested EID, asks ROUTER1 to reply directly to the requesting xTR (ROUTER2) with its address RLOC1. After that, ROUTER2 encapsulates all the client's packets and sends them to RLOC1. ROUTER1, on receiving these packets, decapsulates them and delivers them directly to the server (1.1.1.2).
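This lookup chain can be modeled in a few lines. The sketch below collapses the Map-Resolver / Map-Server / ROUTER1 exchange into a single lookup and uses invented RLOC addresses and class names; it demonstrates only the EID-to-RLOC resolution, not the actual LISP message flow.

```python
import ipaddress

# Illustrative only: the mapping system stores EID-prefix -> RLOC entries
# that the xTRs register for the networks they serve.
class MappingSystem:
    def __init__(self) -> None:
        self.db: dict[ipaddress.IPv4Network, str] = {}

    def register(self, eid_prefix: str, rloc: str) -> None:
        self.db[ipaddress.ip_network(eid_prefix)] = rloc

    def resolve(self, eid: str) -> str | None:
        addr = ipaddress.ip_address(eid)
        matches = [(p, r) for p, r in self.db.items() if addr in p]
        if not matches:
            return None
        # longest-prefix match, so a more specific entry always wins
        return max(matches, key=lambda m: m[0].prefixlen)[1]

mapping = MappingSystem()
mapping.register("1.1.1.0/24", "10.255.0.1")   # ROUTER1 registers its networks

# ROUTER2 gets a client packet for EID 1.1.1.2 and resolves the locator:
rloc = mapping.resolve("1.1.1.2")
print(f"encapsulate client traffic to RLOC {rloc}")   # -> 10.255.0.1
```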
But what happens when the virtual machine (VM) moves? After the migration to another host, the virtual switch (running on the host the VM moved to) sends a RARP or GARP message to the network (depending on the hypervisor type). The purpose of this message is to “tell” the network that the VM is now on the new host. Having intercepted such traffic (in fact, any traffic from the VM will do, but RARP/GARP is usually seen first), the nearest xTR reports to the Map-Server that the VM is now behind it. The Map-Server, in turn, updates its database, replacing the RLOC address through which the VM's EID is reachable. The Map-Server also informs the old RLOC (ROUTER1 in our case) that the virtual machine has left and is now behind the new RLOC address (for example, RLOC3).
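The mobility part then amounts to replacing the RLOC behind which the EID is registered. A small self-contained sketch, with invented addresses and with the notification to the old RLOC reduced to a print:

```python
# Illustrative only: on a VM move, the EID stays the same and only its RLOC
# in the mapping database changes.
mapping_db = {"1.1.1.2": "10.255.0.1"}   # EID -> RLOC before the move

def vm_moved(eid: str, new_rloc: str) -> None:
    old_rloc = mapping_db.get(eid)
    mapping_db[eid] = new_rloc
    # the real Map-Server would also notify the old RLOC that the host left
    print(f"{eid} moved: {old_rloc} -> {new_rloc}; notifying {old_rloc}")

vm_moved("1.1.1.2", "10.255.0.3")   # the VM is now behind RLOC3
print(mapping_db["1.1.1.2"])        # -> 10.255.0.3, the EID itself unchanged
```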
UDP is used to encapsulate the traffic, which ensures that the packets pass transparently through an ordinary IP network.
LISP is supported on a fairly large list of Cisco equipment, for example: ISR G1 and G2, ASR, CRS, Nexus 7K, Catalyst 6K.
General summary
Let us try to sum up the results of all three parts of this article. The following table summarizes the technologies covered in this material.
| Technology | Overlay type | Where it is used | Transport |
|---|---|---|---|
| FabricPath | Network-based | Inside the data center | FabricPath * |
| TRILL | Network-based | Inside the data center | TRILL * |
| VXLAN | Host-based or Network-based | Both inside the data center and between data centers | UDP |
| OTV | Network-based | Between data centers | UDP |
| LISP | Network-based | Optimal traffic delivery to migrating VMs | UDP |
On Cisco equipment, in my opinion, the combination of FabricPath (inside the data center) and OTV (between data centers) has become quite widespread. These technologies are mature and easy to configure, and most of Cisco's designs are based on them; there are real data centers where this combination is in use. It is worth noting, though, that it requires rather complex and expensive hardware. But what can you do.
What about VXLAN? This is a pretty promising technology:
- constantly being improved;
- supported and developed by many vendors;
- support appears on cheaper classes of devices;
- quite flexible: Host-based or Network-based, working both inside the data center and between data centers;
- support for more than 16 million logical segments (VXLAN networks);
- used in SDN networks (in my opinion, a very powerful driver).
Of course, there are a number of issues that still need work. For example, speaking of Cisco, one would like to see MP-BGP EVPN support for unicast transfer mode on simpler hardware, or the ability to work in unicast mode between several VSMs in the Cisco Nexus 1000V solution.
But the technology is constantly evolving. Not so long ago it was hard to consider VXLAN for communication between data centers; now the ability to use MP-BGP EVPN as the VXLAN control plane has made this possible. Thanks to MP-BGP EVPN, flooding is minimized and distributed segments are isolated when using VXLAN. Although, of course, one would prefer no floods and storms at all.
In general, judging by Cisco's documents, the vendor has high hopes for MP-BGP EVPN: Cisco is offering it as a standard SDN architecture for implementing VXLAN communications, as opposed to the Open vSwitch Database Management Protocol (OVSDB). This is nothing new for Cisco; let's see what happens.
It is worth noting that VXLAN plays no part in the operation of the transport network. That is, the transport network must be configured separately, and if it is configured poorly, VXLAN traffic will not be carried optimally. Then again, “crooked hands” are dangerous for any technology.
LISP stands somewhat apart, solving the problem of optimal traffic delivery toward the server and, for some tasks, toward the client. LISP can therefore be used regardless of the overlay technologies inside or between data centers. It is not the simplest solution (from the implementation point of view) to the problem of suboptimal traffic delivery, and it is not one for every occasion. For example, to completely solve the problem of suboptimal client-to-server traffic in the data center (i.e., WAN-to-DC), LISP must be configured for every remote client, which looks like a rather difficult task on the Internet.
And finally, I would like to note a few more points:
- The described technologies are overlays, so they add their own header to the original packet. It is therefore essential to keep a careful eye on the maximum frame size (MTU); see the quick calculation after this list.
- As I wrote earlier, there is no super-technology that solves all problems. Each of the technologies discussed has its pros and cons.
- To make the material easier to digest, I deliberately skipped a number of technical details. So if something raises questions, I will try to answer them.
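On the MTU point, here is a quick back-of-the-envelope check for VXLAN, using the commonly cited header sizes (treat the figures as approximate; other overlays add a different amount of overhead):

```python
# Approximate VXLAN overhead on top of a classic 1500-byte payload.
inner_payload  = 1500   # what the host thinks its MTU is
inner_ethernet = 14     # inner Ethernet header (no 802.1Q tag)
vxlan_header   = 8
outer_udp      = 8
outer_ipv4     = 20

required = inner_payload + inner_ethernet + vxlan_header + outer_udp + outer_ipv4
print(required)   # 1550: the transport network must carry packets this big
```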