Roaming in WiFi networks - 802.11i / r / k / v / OKC, what we really need and how to recognize it

When people talk about roaming, the term usually hides two different processes. In cellular networks, which reached us earlier, roaming means the ability to work in a "foreign" network, not seamless migration between base stations (handover). Moving unnoticed between the base stations (BS) of a cellular network is so natural that it is hardly ever mentioned.

In the world of WiFi things are different: roaming usually means movement between access points of the same network that is invisible to the user, i.e. a BSS transition. That said, the widespread introduction of SMS authorization should in the near future push operators to implement a standard for roaming between different WiFi networks, in the style of the cellular infrastructure and based on its subscriber identification.

What follows is a description of the existing roaming technologies and of ways to detect them on unfamiliar equipment. It is assumed that the reader is familiar with the basic principles of WiFi.

If you evaluate roaming in the sense of switching (that is, handover) in a WiFi network from the standpoint of cellular networks, the most accurate description is that it does NOT exist: the standard does not provide for it, and the situation has not changed for many years. In cellular networks the network controller initiates the switch of a subscriber to another BS, based on reports in which the client evaluates the signal from neighboring base stations. In WiFi the client always makes the decision to switch, and the access point can only suggest how to do it faster. WiFi does have many standardized “crutches” that quite successfully bring the access point change down to 50 ms and save the subscriber's voice-over-IP call, as well as vendor-specific designs that can either help or aggravate an already sad process (Ubiquiti Zero Handoff is an example of a crutch that made things worse than before).

Here it is easy to throw a stone at the author: what about 802.11r / v? But they are not mandatory within WiFi, are not supported by all devices, and do not imply anything like a forced handover with bandwidth reservation. The choice of where and WHEN to switch still remains with the client. Moreover, enabling 802.11r makes it impossible for older clients to connect to the network, because it is a required option in the 802.11 frame! In some cases it is not just unnecessary but harmful (old drivers, printers, scanners and similar devices).

Theory


With the general introduction out of the way, it is worth briefly describing what in WiFi can be useful for roaming (we will call it that) and why.

802.11i


This 2004 amendment, incorporated into the standard in 2007, focuses on security and describes authentication and encryption (WPA2). It interests us because the key exchange procedure and the interaction with external resources (RADIUS) together greatly slow down client switching between APs. It also describes the first principle of fast reconnection, caching of the PMK, though only for access points where the client has already completed the full procedure once, i.e. a quick return to a known AP.

OKC (Opportunistic Key Caching)


The first well-known crutch. During 802.1X authentication the access point stores a pairwise master key (PMK) for each client; the idea is that this key is passed to neighboring access points through the controller, which eliminates new requests to RADIUS, simplifies the exchange, and significantly reduces the time needed to switch to a new access point. It is not part of the standard, with all the consequences that entails, but it is supported by all serious manufacturers of WiFi hardware and by some clients. Without client support the function is useless, and it is equally useless for WPA2-PSK. Some vendors try to force the method when they see a stored key, even if the client did not ask for it in its request; sometimes this works.
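How the client "asks" for a cached key can be sketched in a few lines. For WPA2 with the SHA-1 based key management suites, the 802.11 standard derives the key identifier as the first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AP MAC || client MAC). With OKC the client is assumed to reuse the PMK from its first full authentication and only substitute the address of the new access point. The MAC addresses and the all-zero PMK below are hypothetical values for illustration, not data from the dumps.

```python
import hashlib
import hmac


def pmkid(pmk: bytes, ap_mac: bytes, client_mac: bytes) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AA || SPA)."""
    if len(ap_mac) != 6 or len(client_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    digest = hmac.new(pmk, b"PMK Name" + ap_mac + client_mac, hashlib.sha1).digest()
    return digest[:16]


def mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


# Hypothetical values, for illustration only.
cached_pmk = bytes(32)              # PMK kept from the first full 802.1X authentication
client = mac("02:00:00:00:00:01")
old_ap = mac("02:00:00:00:00:a1")
new_ap = mac("02:00:00:00:00:a2")

# Same PMK, different AP address: the identifier changes, the key does not.
print("PMKID for old AP:", pmkid(cached_pmk, old_ap, client).hex())
print("PMKID for new AP:", pmkid(cached_pmk, new_ap, client).hex())
```

If the new access point has received the same PMK through the controller, it can compute the same identifier and skip the RADIUS exchange; otherwise the full authentication runs as usual.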

802.11k


Radio Resource Management: an amendment of 2008, part of the standard since 2012, optional. The access point advertises support for the option with a flag in its Beacon frames and, when the client asks, sends it a list of neighboring access points. The client does not spend time scanning all available channels: it switches straight to the right channel and selects a new access point. This saves battery, and under high load it also improves the overall state of the air. Together with 802.11v it can make life comfortable enough for clients that other technologies need not be considered (the client chooses the candidate access point anyway), unless of course VoIP and the magic 50 ms matter to you on WPA2-Enterprise. It is useless without client support.

802.11v


Wireless Network Management (WNM): an amendment published in 2011 that entered the standard in 2012, with a large number of options. Its main purpose is effective management of the wireless environment: exchange of environment data between stations, client power saving, and improved roaming and balancing. Messages listing suitable APs are sent to the client, which addresses access point overload (Load Balancing) and “stuck” clients with a weak signal, plus some other features. Assisted Power Saving sets a maximum timeout for the client so that frequent keep-alive messages are not required, and the Directed Multicast Service lets the client receive multicast frames at its own connection rate rather than the cell rate, which frees up the air and saves battery (these functions do not relate to roaming). BSS Transition, however, is very relevant. It defines three types of messages: a request from the client to list suitable access points, and two messages from the access point, a Load Balancing Request when the access point is overloaded and asks the client to move to another one, and an Optimized Roaming Request when the RSSI and data rate do not meet the AP's minimum requirements. It is important to note that these are advisory messages and the action is left to the discretion of the client. Forced disconnection is possible only within proprietary Band / Load Steering / Balancing technologies, where the client is dropped with Disassociate frames, and the client may handle it incorrectly or ignore it completely.

Using 802.11k / v together gives a good result and in most cases is enough for home and small office networks, without causing problems for the various client devices. Next comes the heavy artillery, 802.11r: it radically solves the main problem but can cause side effects.

802.11r / ft


Fast Roaming / Fast BSS Transition. When 802.11r is enabled on an access point it becomes mandatory for the client, i.e. clients that do not support it cannot connect. It is a flag in the management frames plus a changed key exchange mechanism, so an old client that does not know about it is in trouble (new devices that lack support for the function sometimes at least understand the flag, although according to the standard the protocol must be implemented in full). It can also crash badly written drivers of old client adapters. The reason is that the initial 4-Way Handshake used to distribute the shared key changes; this is what the standard says: “A STA shall not use any authentication algorithm except the FT authentication algorithm when using the FT Protocol.”

Fast BSS Transition works with RSNA (Robust Security Network Association, i.e. WPA2) networks and with fully open networks. For WPA2-PSK fast roaming loses its point: the client and the access point still exchange 4 packets, so there is nothing to speed up. These calculations do not include the time needed to find a suitable access point, and in the 5 GHz band that time can be considerable, since you have to scan 16 channels and find a suitable AP. The general strategy is therefore to use the k / v and r protocols together.
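Rough arithmetic shows why the search for an access point can outweigh the key exchange itself. The sketch below simply multiplies the number of channels by a per-channel dwell time. The 100 ms dwell is a hypothetical figure chosen for illustration, not a measurement; real clients scan very differently. The 2-channel case stands for a client that already knows from a neighbor list where to look.

```python
def scan_time_ms(channels: int, dwell_ms: float) -> float:
    """Rough scan duration: every channel is visited once for dwell_ms."""
    if channels < 0 or dwell_ms < 0:
        raise ValueError("channels and dwell_ms must be non-negative")
    return channels * dwell_ms


DWELL_MS = 100.0  # hypothetical dwell time per channel

full_scan = scan_time_ms(16, DWELL_MS)    # all 16 channels of the 5 GHz band
hinted_scan = scan_time_ms(2, DWELL_MS)   # only channels named in a neighbor list

print(f"full scan: {full_scan:.0f} ms, with neighbor list: {hinted_scan:.0f} ms")
```

Whatever the real dwell time is, the ratio stays the same: the fewer channels the client has to visit, the sooner it can start the actual transition.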

If you use RADIUS for authentication and want very fast roaming, you have no choice: only 802.11r!

Besides roaming itself, 802.11r potentially allows the client to ask the target access point whether the resources it needs are available and to reserve them (QoS). Accordingly there are two variants of the protocol: the FT Protocol and the FT Resource Request Protocol. The client can talk to the target access point either directly over the air (Over-the-Air) or through its current access point and the controllers (Over-the-DS); the second way takes a little longer. The QoS resource request has so far hardly been implemented on access points or clients and is practically not used anywhere.

The most important element in the frame is the MDE, the Mobility Domain Element. It is required for successful roaming, which is possible only within a single mobility domain.
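To recognize the MDE in a raw dump it helps to know its layout. As defined in 802.11, it is element ID 54 with a 3-byte body: a 2-byte Mobility Domain ID followed by one byte of flags, where bit 0 means FT over the DS and bit 1 means the Resource Request Protocol is supported. The sketch below decodes such an element and compares two access points; the sample bytes are hypothetical.

```python
from typing import NamedTuple

MDE_ELEMENT_ID = 54  # Mobility Domain element


class MobilityDomain(NamedTuple):
    mdid: bytes             # 2-byte Mobility Domain ID
    ft_over_ds: bool        # FT through the current AP is allowed
    resource_request: bool  # FT Resource Request Protocol is supported


def parse_mde(element: bytes) -> MobilityDomain:
    """Decode one MDE given as raw bytes: ID, length, MDID, flags."""
    if len(element) < 5 or element[0] != MDE_ELEMENT_ID or element[1] != 3:
        raise ValueError("not a Mobility Domain element")
    flags = element[4]
    return MobilityDomain(element[2:4], bool(flags & 0x01), bool(flags & 0x02))


def same_mobility_domain(a: bytes, b: bytes) -> bool:
    """Fast transition is only possible when both MDIDs match."""
    return parse_mde(a).mdid == parse_mde(b).mdid


# Hypothetical elements as they might appear in two Beacon frames.
ap_kitchen = bytes([54, 3, 0x12, 0x34, 0x01])
ap_room = bytes([54, 3, 0x12, 0x34, 0x00])

print(parse_mde(ap_kitchen))
print("same domain:", same_mobility_domain(ap_kitchen, ap_room))
```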

Time spent on client switching depending on the standard (“Performance Study of Fast BSS Transition using IEEE 802.11r” by Sangeetha Bangolae, Carol Bell and Emily Qi):



Please note that this is the “pure” switching time, measured when the client has already decided that the connection is getting worse and has found a new access point!

The practice of 802.11r roaming is perfectly described in the article by antonvn, so I see no reason to repeat it.

But the operation of the other amendments can be examined with an example. Adding a line to a datasheet is not difficult; making that line work is harder. I have a couple of Adtran Bluesocket access points (BSAP 1925) at my disposal. This is the lower middle range: it is far from the market leaders in functionality, but it integrates well into a carrier network and offers good stability and performance. If your company has only 2-3 access points they make little sense for you (except when rented together with a cloud controller), but for distributed or large-scale networks (10-20+) they become interesting. Next to them stands Cambium; they are not at hand for tests right now, but colleagues praise them. According to the documentation, Cambium has slightly more functionality than Bluesocket (802.11r, more tunnel types for user traffic, the ability to run up to 24 access points without an external controller, and other small things), while Bluesocket so far has only 802.11k / v / OKC, with full 802.11r roaming promised in the next software release. Aruba / Cisco / Ruckus predictably can do everything available on the market; the real question is whether you will actually use it. Testing low-cost equipment is often a thankless task: about a year ago we were brought Edimax units, and the stability of the management portal raised such serious questions that testing ended before we reached the depth of the feature set. There are doubts that in this price category the vendor managed to implement proper monitoring of the air and neighbor notifications for clients; it would be interesting if someone could verify this in practice. Ubiquiti does not support roaming yet, nor does Mikrotik (which is a pity!).

It should also be noted that the neighbor notification function makes little sense if the access point does not know about its neighbors, i.e. it needs background channel scanning to find them. In normal mode an access point works only on its own channel and simply cannot know about neighbors on other channels! Ubiquiti tried the solution of putting all access points on one channel and proved in practice that this is a bad idea (nobody doubted it): capacity drops dramatically.

Equipment used


Two Bluesocket BSAP 1925 access points were used. Traffic was captured by two laptops: one ran AirMagnet WiFi Analyzer PRO paired with an AirMagnet PCI Express Card 3x3, and the second, a 2016 MacBook with an 802.11ac adapter, captured traffic on the other channel. Judging by the dump it coped with the task; the program used was Airtool version 1.6. Why not capture from one laptop? We also have 3 Proxim Orinoco a / b / g / n USB adapters bought precisely for simultaneous capture on 3 channels, but as it turned out they miss most of the traffic on modern networks: as soon as any recent client or access point appears on the air, the analyzer stops seeing most of the traffic. We tried to figure out why, but gave up without getting to the bottom of it; something changes in the frames, and 802.11ac is probably to blame. The vendor reports that this is a hardware limitation of the adapters and that there will be no fix for them, so keep it in mind! Just recently AirMagnet released a software update and new USB adapters for it, but we do not have them yet. You most likely do not have them either, nor the AirMagnet analyzer itself, but do you really need it? Everything described below can be seen with any device that can enter monitor mode and capture all traffic in the 5 GHz band. To measure the real transition time you need to capture 2 channels on one computer with two independent adapters: with 2 different machines it is extremely difficult to synchronize the clocks accurately enough (I am not sure of the exact figure, but the required accuracy is on the order of milliseconds), and a single adapter hopping between two channels loses half of the traffic.

Testing was done at a friend's home: one access point was in the kitchen and the other in a room, separated by a load-bearing wall, and they saw each other with a minimal signal. The environment and the presence of neighboring networks are similar to a small office. The phone switched to the access point in the room almost immediately after entering it. To make the task harder, the transition was made quickly, and the signal dropped sharply right after leaving the room around the corner. This setup tested mainly the client's behavior; the access points did not have time to engage their balancing system.

Work with dumps


The dumps were examined in Wireshark, which is free and available to everyone.

802.11k
The access point announces its ability to send a list of neighbors in its Beacon frames:
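In the raw frame this announcement is carried by the RM Enabled Capabilities element (element ID 70), where bit 1 of the first byte indicates Neighbor Report support. The sketch below walks the tagged parameters of a Beacon as plain ID / length / value records and checks that bit. It assumes you have already cut the tagged parameters out of the frame; the sample bytes are hypothetical.

```python
from typing import Iterator, Tuple

RM_ENABLED_CAPABILITIES = 70  # element ID
NEIGHBOR_REPORT_BIT = 0x02    # bit 1 of the first capability byte


def iter_elements(tagged: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (element_id, body) pairs from a run of information elements."""
    pos = 0
    while pos + 2 <= len(tagged):
        element_id, length = tagged[pos], tagged[pos + 1]
        body = tagged[pos + 2:pos + 2 + length]
        if len(body) < length:  # truncated capture, stop quietly
            break
        yield element_id, body
        pos += 2 + length


def supports_neighbor_report(tagged: bytes) -> bool:
    """True if the frame advertises the 802.11k Neighbor Report capability."""
    for element_id, body in iter_elements(tagged):
        if element_id == RM_ENABLED_CAPABILITIES and body:
            return bool(body[0] & NEIGHBOR_REPORT_BIT)
    return False


# Hypothetical tagged parameters: SSID "test" followed by RM Enabled Capabilities.
beacon_tags = bytes([0, 4]) + b"test" + bytes([70, 5, 0x02, 0, 0, 0, 0])
print("neighbor report supported:", supports_neighbor_report(beacon_tags))
```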



If it wishes to obtain the list of access points for its SSID, the client sends an Action frame. In my case the client requests the list of neighbors after connecting to the SSID (Wireshark filter by frame type: wlan.fc.type_subtype eq 13):
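The filter matches every Action frame, so it helps to know which ones belong to 802.11k and 802.11v. The body of an Action frame starts with a category byte and an action byte; in 802.11 the Radio Measurement category is 5 and the WNM category is 10. The sketch below names the combinations relevant to this article. It assumes the input is the frame body after the MAC header, and the sample bytes are hypothetical.

```python
# (category, action) pairs relevant to roaming, as numbered in 802.11.
ACTION_NAMES = {
    (5, 4): "802.11k Neighbor Report Request",
    (5, 5): "802.11k Neighbor Report Response",
    (10, 6): "802.11v BSS Transition Management Query",
    (10, 7): "802.11v BSS Transition Management Request",
    (10, 8): "802.11v BSS Transition Management Response",
}


def describe_action(body: bytes) -> str:
    """Name an Action frame by the first two bytes of its body."""
    if len(body) < 2:
        raise ValueError("Action frame body is too short")
    category, action = body[0], body[1]
    return ACTION_NAMES.get(
        (category, action), f"other (category {category}, action {action})"
    )


# Hypothetical body: category 5, action 4, dialog token 1.
print(describe_action(bytes([5, 4, 1])))
```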



The current AP answers the client with a Neighbor List Report containing the list of neighboring access points, indicating on which channel to look for which access point:
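Each neighbor in such a response is a Neighbor Report element (element ID 52). Its body, as defined in 802.11, holds the BSSID (6 bytes), BSSID Information (4 bytes), operating class, channel number and PHY type (1 byte each), optionally followed by subelements. The sketch below extracts these fields; it assumes the input is the part of the frame body that follows the category, action and dialog token bytes, and the sample values are hypothetical.

```python
from typing import List, NamedTuple

NEIGHBOR_REPORT = 52  # element ID


class Neighbor(NamedTuple):
    bssid: str
    bssid_info: int
    operating_class: int
    channel: int
    phy_type: int


def parse_neighbor_reports(elements: bytes) -> List[Neighbor]:
    """Collect all Neighbor Report elements from a run of elements."""
    neighbors = []
    pos = 0
    while pos + 2 <= len(elements):
        element_id, length = elements[pos], elements[pos + 1]
        body = elements[pos + 2:pos + 2 + length]
        if len(body) < length:  # truncated capture
            break
        if element_id == NEIGHBOR_REPORT and length >= 13:
            neighbors.append(Neighbor(
                bssid=":".join(f"{b:02x}" for b in body[:6]),
                bssid_info=int.from_bytes(body[6:10], "little"),
                operating_class=body[10],
                channel=body[11],
                phy_type=body[12],
            ))
        pos += 2 + length
    return neighbors


# Hypothetical element: one neighbor on channel 44.
sample = bytes([52, 13, 0x02, 0, 0, 0, 0, 0xA2, 0, 0, 0, 0, 115, 44, 9])
for neighbor in parse_neighbor_reports(sample):
    print(neighbor.bssid, "on channel", neighbor.channel)
```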



802.11v
I could not find traces of 802.11v in operation: to trigger balancing the access point has to be loaded heavily, and this time I did not manage to wait long enough to see what happens on the air. Bluesocket declares an adaptive balancing system that always lets the client connect to the radio it wants and then moves it if necessary. It makes no sense to push everyone to 5 GHz by default when 2.4 GHz is empty; besides, the client does not always have enough signal to use 5 GHz while being prevented from connecting on 2.4 GHz. In my experience the balancing works, but I still have not managed to capture it fully in a dump. This time I got only Disassociate messages sent after the signal dropped and the client stopped responding, and by then the client had already reconnected. Recent devices, like my Xperia Z5, connect to 5 GHz right away, and all new Apple devices do the same. I limited myself to checking the neighbor list and to a dump of roaming on two channels simultaneously. While analyzing the switch I saw something quite interesting: a delay in the device's transmission of certain packets when the link is already established and working, yet there is no application traffic for a long time. So when you test a specific application you have to take into account how it works and how the network stack of your device behaves: it is quite possible that the delay is caused not by WiFi but by your software!

Client Stack Features


Now the most interesting part: the dump from channel 44, to which the client switched. The dump shows that 46 milliseconds pass from the first request to a successful key exchange, so no 802.11r is needed with a WPA2 pre-shared key. It all comes down to how quickly the client realizes that it needs to switch and finds the right access point. But the most interesting thing is that the test application's traffic was absent for another 3 seconds! For clarity, ping was run with an interval of 15 ms; the interval was not always kept because of the nature of WiFi and the lack of traffic prioritization (Best Effort). Ideally you would test with something more sensible, but the ping program was already on the device, so we made do with it.
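The two intervals discussed here can be computed from any dump once the frames are reduced to (timestamp, kind) pairs. The sketch below is one way to do it; the labels "auth", "eapol" and "icmp" and the timestamps are hypothetical, chosen only to resemble the figures above, and are not taken from the actual capture.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, frame kind)


def roam_timings(events: List[Event]) -> Tuple[float, float]:
    """Return (handshake_ms, silence_ms) for one roam on the new channel.

    handshake_ms: first authentication frame to last EAPOL key frame.
    silence_ms:   last EAPOL key frame to first ICMP packet after it.
    Raises ValueError if one of the three frame kinds is missing.
    """
    first_auth = min(t for t, kind in events if kind == "auth")
    keys_done = max(t for t, kind in events if kind == "eapol")
    first_ping = min(t for t, kind in events if kind == "icmp" and t > keys_done)
    return (keys_done - first_auth) * 1000.0, (first_ping - keys_done) * 1000.0


# Hypothetical event list for illustration.
events = [
    (10.000, "auth"), (10.002, "auth"),
    (10.004, "assoc"), (10.007, "assoc"),
    (10.020, "eapol"), (10.028, "eapol"), (10.037, "eapol"), (10.046, "eapol"),
    (10.300, "other"),
    (13.050, "icmp"),
]

handshake_ms, silence_ms = roam_timings(events)
print(f"handshake: {handshake_ms:.0f} ms, application silence: {silence_ms:.0f} ms")
```

Keeping the two numbers separate makes it clear which part of the delay belongs to WiFi and which to the client's own software.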

Authentication and successful connection:



After the connection is established network traffic appears, but it is not ICMP; it is some other packets! ICMP requests appear only 3 seconds later:



This is what happens on the original access point at that time. It is hard to say whether the client really started connecting to the new access point before fully disconnecting from the original one, as the dump suggests, because the timestamps may not be accurate:



The access point receives the last packets from the client at a signal level of -80 dBm, then the client fails to acknowledge several packets, and the access point sends it Disassociate messages. The client is probably already transmitting successfully on the new channel by then: nothing stops it from switching to that channel to scan for available access points without disconnecting from the current one, and in that case the switch does not take much time.

Subjectively some switching delay is present and the pings freeze, but as the dump showed, this is not a WiFi problem. The device never fully disconnected from the network: the signal drops when moving between rooms and then quickly returns to high values.

If the client supports BSS Transition, this can be seen in the dump by the corresponding flag in the client's Probe Request frame:
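In the raw frame this flag is bit 19 of the Extended Capabilities element (element ID 127), i.e. bit 3 of the third byte of its body. The sketch below checks it in a run of tagged parameters; it assumes the tagged parameters have already been cut out of the Probe Request, and the sample bytes are hypothetical.

```python
from typing import Optional

EXTENDED_CAPABILITIES = 127  # element ID
BSS_TRANSITION_BIT = 19      # bit number inside the element body


def find_element(tagged: bytes, wanted_id: int) -> Optional[bytes]:
    """Return the body of the first element with the given ID, if any."""
    pos = 0
    while pos + 2 <= len(tagged):
        element_id, length = tagged[pos], tagged[pos + 1]
        body = tagged[pos + 2:pos + 2 + length]
        if len(body) < length:  # truncated capture
            return None
        if element_id == wanted_id:
            return body
        pos += 2 + length
    return None


def supports_bss_transition(tagged: bytes) -> bool:
    """True if the Extended Capabilities element has the BSS Transition bit set."""
    body = find_element(tagged, EXTENDED_CAPABILITIES)
    if body is None:
        return False
    byte_index, bit_index = divmod(BSS_TRANSITION_BIT, 8)  # byte 2, bit 3
    if len(body) <= byte_index:
        return False
    return bool((body[byte_index] >> bit_index) & 1)


# Hypothetical Probe Request tags: wildcard SSID plus Extended Capabilities.
probe_tags = bytes([0, 0]) + bytes([127, 3, 0x00, 0x00, 0x08])
print("BSS Transition supported:", supports_bss_transition(probe_tags))
```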



Conclusions?


You should not chase technology for its own sake; it does not always play a decisive role. Even with the most fashionable WiFi access points the last word belongs to the client. Using the information above, you can check for yourself whether your access points match your needs and the functionality claimed in their description, and choose the technologies you actually need.

Proper placement of access points on the premises and careful network planning will give good results even with low-cost equipment; conversely, with top-end hardware you can easily ruin a project through careless installation.
