A resilient link built on a cluster of cellular modems (SD-WAN): solving the route selection problem


    Tests "in the field"

    The business task is this: sites have to be connected to a regular WAN quickly, in places where there is only cellular coverage and no way to run a cable or set up a radio relay hop to fiber or copper.

    The solution is modem clusters. The obvious complication is that each modem is a separate physical channel. They have to be combined, with a chisel and some strong language, into one self-contained device that simply delivers a channel. On top of that, when a cable does become available, there should be no need to replace the box or reconfigure anything.

    Who to choose from


    The technology is called SD-WAN. Three American startups are considered the leaders: Versa Networks, Viptela and VeloCloud. The classic network equipment manufacturers are trying to catch up. Cisco, in particular, claims to have two SD-WAN solutions, iWAN and Meraki, yet a couple of months ago it announced the purchase of Viptela. And about a year and a half ago Riverbed bought Ocedo to enter the SD-WAN market.

    In total we evaluated:

    • Cisco Systems
    • Huawei
    • Nuage Networks
    • Viptela
    • VeloCloud
    • Silver Peak
    • Versa Networks
    • Citrix
    • InfoVista
    • Riverbed

    What we ended up with


    We settled on the Versa solution, first of all because of the price. Conventional SDN solutions are used for slightly different tasks, in particular for joining a company's branches into one logical network that all terminals and servers see as a single physical address space. This is somewhat similar to Cisco DMVPN, but with its own extras in the form of ZTP, channel bonding and SLA. The chosen solution turned out to be a little more specialized: the box itself does not carry the full protocol stack of a classic router and is built from standard components, which lowers the base cost. Most vendors offer hardware and software as an outright purchase or by subscription. Versa does not make hardware, so the software comes by subscription and the hardware comes from partners who build reference x86 boxes. For customers, this cost model becomes more favorable with every year of operation. For example, the smallest Versa box (pictured above) costs no more than a Cisco 800 Series router, yet it can push 500 Mbps. And that is on IMIX traffic (90% TCP and 10% UDP) with IPv4 routing/forwarding, IPsec encryption, Layer 7 application-based traffic steering, CGNAT, next-generation firewall (NGFW), QoS (classification and marking), SLA monitoring, internal service chaining and URL filtering all enabled.


    The concept of SD-WAN

    SD-WAN takes from SDN the principle of separating the control plane from the data plane, as well as the overlay. The control plane consists of Director and the controller (BGP route reflector and IPsec). Analytics is an optional component, but it adds visibility into the services used on the WAN.


    Comparison of those two boxes.

    The logic of operation inside each branch box.


    Logical box architecture

    How to install


    The comparison is this: at the end of the year Cisco moves the 19xx series to EOS, and it will have to be replaced. That means physically visiting every site in person (by car or by plane). With Versa you can ship a box to the site with any delivery service. On site, someone plugs a modem or a cable with Internet access into the box. As soon as the box gets a fresh connection, it builds a VXLAN tunnel to the controller by itself, establishes an IKE session with it, and then builds a service IPsec tunnel through which it pulls its settings and receives all network routes from the controllers, because, I remind you, the controller is a BGP route reflector. All of this happens fully automatically.

    That is, even an accountant can handle it on site: plug in a cable, a little magic, and it works.


    Initialization of the box at the new site

    Here is how this procedure looks in the English documentation:

    1. Branch device comes with stage – Controller's IP address is the remote IP in IPSec config.
    2. Establishes IKE session with controller over VXLAN tunnel.
    3. Controller assigns an IP address to the branch device and generates a notification to Versa Director.
    4. VD IP address is notified to branch.
    5. Branch installs reverse route to VD.
    6. VD pushes the post staging configuration to branch device over the IKE session and reboots the branch device.
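    The six documented steps can be read as a simple linear state machine. Below is a minimal sketch of that reading in Python. The state names and functions are hypothetical illustrations, not part of any Versa API, and the sketch models only the order of the steps, not the actual network exchange.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical names for the six documented staging steps."""
    FACTORY = auto()           # 1. box ships with the controller's IP preconfigured
    IKE_ESTABLISHED = auto()   # 2. IKE session with the controller over VXLAN
    ADDRESS_ASSIGNED = auto()  # 3. controller assigns an IP and notifies Director
    DIRECTOR_KNOWN = auto()    # 4. Director (VD) address is sent to the branch
    REVERSE_ROUTE = auto()     # 5. branch installs a reverse route to VD
    PROVISIONED = auto()       # 6. VD pushes post-staging config, box reboots


ORDER = list(Stage)  # Enum members iterate in definition order


def advance(current: Stage) -> Stage:
    """Move to the next staging step; the last step is terminal."""
    idx = ORDER.index(current)
    return ORDER[min(idx + 1, len(ORDER) - 1)]


def run_staging() -> list:
    """Walk the whole sequence and return the visited states in order."""
    state = Stage.FACTORY
    visited = [state]
    while state is not Stage.PROVISIONED:
        state = advance(state)
        visited.append(state)
    return visited
```

    The point of the linear model is that no step needs a person on site: each transition is triggered by the previous one completing.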

    The settings themselves are templates, and templates with variables at that: they can include QoS, shaping, SLA, LAN settings, rules for balancing the external channels, and bonding of those channels into one pipe.

    If you really want to, you can still get onto the box by hand: the manufacturer has not locked it down yet, and there, surprise, you find a Juniper-like console.
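    To illustrate the idea of one shared template filled with per-site variables, here is a small sketch using Python's standard `string.Template`. The template keys, their syntax and the site values are invented for the example and do not reflect the real Versa template format.

```python
from string import Template

# Hypothetical template: keys and layout are illustrative only.
SITE_TEMPLATE = Template(
    "hostname $site_name\n"
    "lan-subnet $lan_subnet\n"
    "wan-links $wan_links\n"
    "shaping-mbps $shaping_mbps\n"
)


def render_site(site_name, lan_subnet, wan_links, shaping_mbps):
    """Fill the shared template with the variables of one site."""
    return SITE_TEMPLATE.substitute(
        site_name=site_name,
        lan_subnet=lan_subnet,
        wan_links=",".join(wan_links),
        shaping_mbps=shaping_mbps,
    )


# One template, many sites: only the variables differ.
config = render_site("shop-042", "10.42.0.0/24", ["lte0", "lte1"], 50)
```

    `substitute` raises `KeyError` when a variable is missing, which is a convenient property for this use: a site with an incomplete variable set fails at render time instead of receiving a half-filled config.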

    Tests



    Prototypes with two modems

    The goal was to emulate a real network in which some of the boxes are on cable and some are on LTE clusters. 3G/LTE dongles (MTS and MegaFon at the time of the photo) were plugged into one box, and a cable with Internet from Gars into the second. The modems are combined into one pipe and data flows. The box inspects the traffic, recognizes it by application, and applies policies and prioritization criteria.

    After a little tinkering, the boxes began to recognize the locked MegaFon and MTS modems (as in the photo) automatically. Synthetic iperf traffic in 5 sessions was balanced almost 50/50.
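    The article does not say how the box spreads sessions across modems. One common technique for per-flow balancing is hashing the flow's 5-tuple, sketched below under that assumption. The interface names, addresses and ports are made up for the example.

```python
import hashlib

LINKS = ["lte-mts", "lte-megafon"]  # hypothetical interface names


def pick_link(src_ip, src_port, dst_ip, dst_port, proto, links=LINKS):
    """Map a flow's 5-tuple to one link; the same flow always gets the same link."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an integer and reduce modulo link count.
    return links[int.from_bytes(digest[:4], "big") % len(links)]


# Five iperf-like sessions that differ only in source port (example values).
flows = [("10.0.0.2", 50000 + i, "192.0.2.10", 5201, "tcp") for i in range(5)]
assignment = {flow: pick_link(*flow) for flow in flows}
```

    With hashing, the split approaches an even distribution only over many flows. With five sessions the split by session count cannot be exactly even, so a near 50/50 result by traffic volume suggests either favorable hashing or a load-aware scheduler.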


    Built-in monitoring functions of Director


    Load graphs of the two LTE modems in the Director interface

    The total bandwidth was around 50 Mbit/s, with 25 Mbit/s on each modem individually. The built-in analytics produced load statistics in real time right out of the box.

    Empirically, telephony worked out very well. For example, with 15 simultaneous phone calls, we put calls one through eight on the first channel and the rest on the second channel with high priority (by default the second channel also carries other office services such as mail).
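    The call placement rule from this example fits in a few lines. The sketch below hard-codes the threshold of eight calls from the text. The function name and channel labels are hypothetical.

```python
def channel_for_call(call_index, first_channel_capacity=8):
    """Return the channel for the N-th simultaneous call (numbered from 1)."""
    if call_index < 1:
        raise ValueError("calls are numbered from 1")
    if call_index <= first_channel_capacity:
        return "channel-1"
    # Overflow calls share channel 2 with other office traffic,
    # so they are the ones that need high priority there.
    return "channel-2"


# Placement for 15 simultaneous calls, as in the example above.
placement = [channel_for_call(i) for i in range(1, 16)]
```

    For 15 calls this puts 8 on the first channel and 7 on the second.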

    The second feature: in places with many drops on the last mile because of network load or coverage, we managed to deliver a more or less stable connection.

    We tested per-packet balancing, where one TCP session is split across two LTE interfaces. With iperf and a single TCP session, the dashboard shows the session divided between the two channels. So it works on synthetic traffic; beyond that, it depends on how each specific application behaves under such balancing. From our own experience we can confirm, for example, that broadcasting video over RTSP via VLC works fine. The policy can be applied separately for each service: services that work well with per-packet balancing are balanced per packet, the rest per flow. The policies themselves are rolled out to the boxes at the click of a button. Because of this, best practice is to set up several groups of sites: test ones close to you (preferably in the office),

    Switching traffic from one LTE modem to the other works. We tested it with a single iperf session and the same VLC video stream. When you pull out the modem carrying the session, throughput dips briefly (iperf shows a drop of 40-60%), the video stutters for a couple of seconds, and then everything recovers.
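    The behavior tested here (per-packet balancing for some services, per-flow for the rest, and failover when a modem is pulled) can be modeled with a toy scheduler. This is only a sketch of the logic with hypothetical names. The real box also has to deal with reordering, link quality and retransmissions, which the sketch ignores.

```python
from itertools import count


class Bundle:
    """Toy model of a link bundle with per-service balancing modes."""

    def __init__(self, links, per_packet_services):
        self.links = list(links)                    # currently usable links
        self.per_packet = set(per_packet_services)  # services balanced per packet
        self._rr = count()                          # round-robin packet counter

    def link_down(self, link):
        """Remove a failed link; later packets use the remaining links."""
        self.links.remove(link)

    def route(self, service, flow_id):
        """Return the link for one packet of the given service and flow."""
        if not self.links:
            raise RuntimeError("no links available")
        if service in self.per_packet:
            # Per-packet: consecutive packets alternate between links.
            return self.links[next(self._rr) % len(self.links)]
        # Per-flow: every packet of a flow stays on one link.
        return self.links[flow_id % len(self.links)]
```

    For example, with `Bundle(["lte0", "lte1"], {"rtsp"})`, successive RTSP packets alternate between `lte0` and `lte1`, while a mail flow with `flow_id=7` stays on `lte1`. After `link_down("lte1")` both end up on `lte0`, which corresponds to pulling the modem out in the test above.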

    Versa features


    1. Analytics is built in. No external management systems are needed, and there is no need to monitor channels separately with PRTG and the like: everything is there out of the box.
    2. One controller can serve any number of devices; in effect it is web-scale. In particular, a telecom operator or cloud provider does not need to deploy a separate SD-WAN solution in the telco cloud for each customer.
    3. If needed, important traffic can be configured for packet duplication across different channels to guarantee delivery.
    4. The boxes also use TPM chips, special modules for storing encryption keys on the device. When the TPM is initialized, a key pair is created: private and public. The private key cannot be read: there is no method to access it, only an API for calling encryption and decryption.
    5. Dynamic tunnels. The controllers, acting as BGP route reflectors, push route information down to the box at each site, and a spoke-to-spoke tunnel is built only when traffic appears between sites. This is also what lets the solution scale to thousands of sites.
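    The scaling argument behind dynamic tunnels (point 5) is easy to see in a sketch: tunnels are created on first traffic between a pair of sites, so their number tracks actual communication rather than the size of a full mesh. The class and function names below are hypothetical and the model ignores tunnel teardown.

```python
class TunnelTable:
    """Spoke-to-spoke tunnels created lazily, on first traffic between two sites."""

    def __init__(self):
        self._tunnels = set()

    def send(self, src_site, dst_site):
        """Record traffic; return True if it triggered a new tunnel."""
        if src_site == dst_site:
            raise ValueError("source and destination must differ")
        pair = frozenset((src_site, dst_site))  # a tunnel serves both directions
        created = pair not in self._tunnels
        self._tunnels.add(pair)
        return created

    def __len__(self):
        return len(self._tunnels)


def full_mesh_size(n_sites):
    """Number of tunnels a static full mesh of n sites would need."""
    return n_sites * (n_sites - 1) // 2
```

    A static full mesh of 1000 sites would need `full_mesh_size(1000)`, that is 499500 tunnels, while the on-demand table holds only the pairs that have actually exchanged traffic.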

    Differences between classic site-to-site IPsec and Versa SD-WAN IPsec:


    VPN cloud from Versa

    Another important point: for mobile operators the hardware is several times cheaper, because it can be purchased almost directly from large factories. The vendor authorizes the factory to ship devices directly to a large client and supplies the factory with the base firmware. The software is flashed onto the box, and the box arrives at the telecom operator or cloud provider. The provider enters 2 parameters: the controller's IP address and the Internet access config (for example, a static IP). After that the box can go to the customer even by regular mail, and it will figure out the rest by itself. Compatibility is broad: IP telephony, video conferencing and other services can be enabled right away.

    There are also the standard packet steering (SLA) and bonding features: bonding simply combines a group of external channels into one pipe, with switching logic defined for each service. Moreover, the Versa or Riverbed solution automatically recognizes which service the traffic belongs to, not at the level of port and session type, but at the level of "give priority to corporate Skype for Business, but not to regular Skype video calls." Bonding also helps with long last-mile provisioning times. We plugged in 3 LTE modems from different operators and settled the question of 99.999% site availability.
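    The availability figure can be checked with back-of-the-envelope arithmetic. If the site is considered up while at least one modem is up, and link failures are independent (a strong assumption: operators may share a tower, a backhaul or the site's power), the bundle availability is one minus the product of the individual unavailabilities. The 98% per-link figure below is a hypothetical input, not a measurement from the article.

```python
def bundle_availability(link_availabilities):
    """Availability of a bundle that is up while at least one link is up.

    Assumes independent link failures, which real cellular links
    sharing a tower or power feed may violate.
    """
    p_all_down = 1.0
    for a in link_availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


# Hypothetical input: each LTE link is up 98% of the time.
two_links = bundle_availability([0.98, 0.98])
three_links = bundle_availability([0.98, 0.98, 0.98])
```

    Under these assumptions two links give about 99.96%, and the third link brings the result to about 99.9992%, just above five nines. Correlated outages would lower the real figure.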


    Dynamic tunnels between boxes at different sites

    Summary


    1. We have now deployed the service for configuring these boxes on our own premises rather than in an American cloud (as some vendors do), to make life easier for Russian companies.
    2. The solution suits mobile operators and cloud providers very well. For operators it is a chance to sell retail customers (for example, a chain of laundries, grocery stores or car service shops) not only a channel for sending, say, reports and mail, but also an additional service: managing application traffic over a bundle of channels, plus varied monitoring of application quality over the existing communication channels (the so-called managed service). By the way, the global trend shows that the largest SD-WAN projects by number of connected sites are currently in the financial sector and large retail.
    3. Very simple repair. If something breaks in a grocery store on the other side of the planet, the salesperson moves the cable from one box to another, and the old configuration loads onto the new box automatically in 10-15 minutes. In general, even an accountant can handle it if need be.
    4. In none of these cases do you need an IT specialist at the branch to work with SD-WAN.

    That's all. I will answer questions in the comments, or you can write to me by email: MKazakov@croc.ru
