Trying to improve fast roaming

  • Question
  • Updated 2 months ago
  • Answered
Hi,

I'm new to the WiNG software so still getting to grips with the interface and features.  I'm no WiFi Expert but can normally find my way around.

In my new role I have inherited 3 x NX-5500 (v5.8.0) controllers and about 100 APs (AP650 & AP7522).

One of the WLANs (WPA2 with Passkey) is used in the warehouse where the terminals (FLT Mounted & MC Handheld Guns) are roaming around the warehouse.  There were previously WAY too many APs in the warehouse but this has been reduced now and we have used heatmapping software to check the signal strength around the warehouse.  Also the channels in use have been staggered to reduce any overlap, but SmartRF is not enabled.

The application on the devices constantly talks back to the server updating pick & pack instructions.

We notice on the handhelds, if running a constant ping, that they occasionally lose single pings as we walk through the warehouse.  I am assuming this is as they switch to a different AP.

However on the FLT mounted units, as they fly at breakneck speed through the warehouse, the drop out is much more pronounced and because it drops for so long, the software times out.

If we change the network settings and disable WPA, reducing to an Open network locked down to a MAC based ACL, then the drop out is less pronounced and the software is able to recover the dropouts.

Summary, it seems that when encryption is enabled and the FLT units roam quickly around the warehouse the problem is worse.  With Encryption disabled, the problem is slightly better.

Q1: Is there any general performance benefit to setting the WLAN to work in Tunnel mode rather than local (currently Tunnel)?
Q2: Is there any roaming benefit to using Tunnel based mode over Local?
Q3: I understand the negotiation with Encryption enabled takes longer but I assumed we were talking in the milliseconds, so why would the roaming take so much longer on the devices that are travelling faster?
Q4: If Roaming assistance is applied in the configuration, is this likely to help the issue or am I likely to still get the dropouts as the unit switches between AP's.


Apparently my employer has previously had a Motorola/Zebra engineer out to look at this and they never got to the bottom of it.  Either, this problem has flummoxed the experts or the expert was a bit **** at his job.

Thanks in advance for any help

MarkS


Posted 2 months ago


Andrew Webster

While this might not be the true source of the problem, what you describe does seem to point to there still being too much signal. You indicated that heat mapping was done, but what software was used and how was this carried out?  Were the connection speeds measured at the same time?  For instance, just because you see -70dBm doesn't mean you have a good connection.

Basically, when WPA2 is in use, the 4-way EAPOL handshake must complete, as you pointed out, in a time frame measured in milliseconds. However, there are a couple of factors at play.
The 4-way handshake usually takes place at the lowest speed setting. Consequently it takes more airtime to complete, but the signal can also be decoded at the furthest distance.  That being said, it also interferes with other handshakes taking place at the same time on the same channel.  For example, 1 Mbps will happily decode at -92dBm, so your cell size might be much larger from a handshaking perspective than from a data transmission perspective, meaning that while the handshake could complete, the data path is no good.
You can verify this by performing data captures on the APs and looking at the reported signaling rates in the capture.
You can also look at the logs. If you spot many Reason Code 15 or 16 messages (4-way handshake timeout and group key handshake timeout, respectively), this is indicative of the aforementioned problem.

There are fast roaming options that can be enabled on the WiFi infrastructure to speed up the 4-way handshake, but you don't indicate whether they are in use. Be aware that not all client devices behave in the same way when these features are enabled.

You also need to be mindful of the number of SSIDs that are being used. Hidden or not, these consume airtime. For instance, if you have 8 SSIDs in use, you have an overhead of 25%, and if there is even a hint of channel overlap, that can quickly go up to 100%.  Anything greater than 6 SSIDs is going to cause problems, and if you can keep it to 3 or 4 you'll be in much better shape.
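To put rough numbers on the SSID overhead point, here is a back-of-the-envelope sketch in Python. It only counts beacon frames, and every constant in it is an assumption rather than something measured on this network: 300-byte beacons, the 802.11b long preamble (192 microseconds), the common 102.4 ms beacon interval, and beacons sent at 1 Mbps. The function names are made up for illustration. Probe responses and retries would push real figures higher.

```python
# Rough estimate of channel airtime consumed by beacons alone.
# All constants below are illustrative assumptions, not measurements.

PREAMBLE_US = 192.0             # 802.11b long preamble + PLCP header, in microseconds
BEACON_INTERVAL_US = 102_400.0  # 100 time units of 1024 microseconds each


def frame_airtime_us(size_bytes: int, rate_mbps: float) -> float:
    """Time on air for one frame: preamble plus payload bits divided by the rate."""
    return PREAMBLE_US + (size_bytes * 8) / rate_mbps


def beacon_overhead(ssid_count: int, aps_on_channel: int = 1,
                    beacon_bytes: int = 300, rate_mbps: float = 1.0) -> float:
    """Fraction of airtime (0.0 to 1.0) spent on beacons on one channel.

    Each SSID on each AP audible on the channel sends one beacon per interval.
    """
    per_beacon = frame_airtime_us(beacon_bytes, rate_mbps)
    fraction = ssid_count * aps_on_channel * per_beacon / BEACON_INTERVAL_US
    return min(fraction, 1.0)  # airtime cannot exceed 100%


if __name__ == "__main__":
    for ssids in (1, 4, 8):
        print(f"{ssids} SSIDs, 1 AP on channel: {beacon_overhead(ssids):.1%}")
    print(f"8 SSIDs, 4 APs audible on channel: {beacon_overhead(8, 4):.1%}")
```

Under these assumptions 8 SSIDs on a single AP come out at about 20% of airtime, the same ballpark as the 25% quoted above, and the figure climbs quickly once several APs can hear each other on the same channel. Raising the minimum basic rate shrinks the per-beacon cost, which the `rate_mbps` parameter lets you explore.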



To answer your questions specifically:

Q1: Is there any general performance benefit to setting the WLAN to work in Tunnel mode rather than local (currently Tunnel)?  No, but tunnel mode does impose bandwidth constraints on your infrastructure, whereas local bridging does not.  There's lots of documentation/presentations about this available from Extreme.
Q2: Is there any roaming benefit to using Tunnel based mode over Local? No, not unless your network topology is broken and you can't fix it.
Q3: I understand the negotiation with Encryption enabled takes longer but I assumed we were talking in the milliseconds, so why would the roaming take so much longer on the devices that are travelling faster?  See above
Q4: If Roaming assistance is applied in the configuration, is this likely to help the issue or am I likely to still get the dropouts as the unit switches between AP's.  Roaming assistance will likely make the problem worse.  It is designed to force sticky clients to roam (iPhones for example).  MC devices are much better at roaming on their own.




MarkS

Hi Andrew,

Thanks for your reply.

I'm not sure what software was used for all of the mapping and surveying, different consultants were used and it's all before my time with the company.
One of my predecessors produced a heat map using Ekahau.  I've never done heatmaps before so I'm just looking around for something to use.

Initially one company came in and suggested way too many AP's.  These were installed and they had issues.
Another consultant came and said there were way too many APs and a good 60-70% were removed.
There is now one every other aisle, in alternating aisles.  Channels have been configured to try to put the greatest possible distance between APs on the same channel.
Also, the 5 GHz band has been disabled.

I'm not too sure what fast roaming options are available as I've never needed to specifically configure anything for fast roaming before, so I'm about 48 hours into investigating this problem and learning as I go.
I believe there was a BSS Fast Transition option but I may need a firmware update to get the feature.

Mark

Andrew Webster

Hi Mark,

Ekahau is the de facto industry standard for heat-mapping and all things related to wireless troubleshooting, but you do need to understand the physics behind RF propagation to understand what's really going on.

As a general rule, too many APs can produce symptoms similar to too few APs.

A firmware update would be a good idea.  You need to have a support agreement in place on your APs in order to access this at Extreme.  If you need to stay in the 5.8 branch for your AP650s to remain functional, then 5.8.6.9 is the latest version.  Keep in mind that 5.8.0 was released in fall 2015, and there have been many changes since.
Fast roaming comes in several flavors: Pairwise Master Key (PMK) caching, only useful if you are doing WPA2-Enterprise authentication; Opportunistic Key Caching, where all APs receive the client's PMK ID, again only useful with WPA2-Enterprise; Pre-authentication (802.11i); Fast BSS Transition (802.11r); Fast BSS Transition over DS (802.11r over DS); and Radio Resource Measurement (802.11k), which helps determine what the clients (on supported supplicants) are hearing.

One other thing that might be brought to bear is extending the number of retries and the timeout within which the EAP handshake must take place.  This is more of a band-aid, but can help in situations where there is too much background noise interfering with the low-level protocols.
wpa-wpa2 handshake [attempts <1-5> (2 default) | init-wait <5-1000000> (no default) | priority [high|normal] (high default) | timeout <10-5000> {10-5000} (500 default)]
YMMV

Using CLI packet captures on the AP or controller you can capture a handshake sequence directly from one or more APs and load it up in wireshark to see what's really going on.  (timeouts, missing/damaged packets, etc).
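Alongside the captures, the constant ping test described at the top of the thread can be turned into numbers, so that roam-related outages can be compared before and after a configuration change. The following is a minimal Python sketch, assuming Linux-style ping output in which each reply line contains `bytes from` and `icmp_seq=N` (Windows ping does not print sequence numbers, so it would not work there as written). The function name `find_gaps` and the sample addresses are made up for illustration, and sequence-number wraparound is not handled.

```python
import re

# Matches the sequence number in a Linux-style ping reply line.
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_gaps(ping_output: str, interval_s: float = 1.0):
    """Return a list of (first_missing_seq, missing_count, approx_outage_s).

    Only lines containing "bytes from" count as replies, so error lines such
    as "Destination Host Unreachable" are treated as lost pings.
    interval_s is the spacing between pings (1 second is the usual default).
    """
    seqs = set()
    for line in ping_output.splitlines():
        if "bytes from" not in line:
            continue
        match = SEQ_RE.search(line)
        if match:
            seqs.add(int(match.group(1)))

    ordered = sorted(seqs)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        missing = cur - prev - 1
        if missing > 0:
            gaps.append((prev + 1, missing, missing * interval_s))
    return gaps


if __name__ == "__main__":
    sample = "\n".join([
        "64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=2.1 ms",
        "64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=2.3 ms",
        "From 10.0.0.5 icmp_seq=3 Destination Host Unreachable",
        "64 bytes from 10.0.0.1: icmp_seq=6 ttl=64 time=40.2 ms",
    ])
    for first, count, outage in find_gaps(sample):
        print(f"lost {count} ping(s) starting at seq {first}, about {outage:.1f} s")
```

Running the ping at a shorter interval and passing the same value as `interval_s` gives a finer estimate of how long each dropout lasts, which is the figure that matters for the application timeout on the FLT units.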


Richard Augusto

In our warehouse (more than 64 APs) we do not have problems with the collectors (Symbol/Motorola only).

Have you checked whether your AP650s are single radio or dual radio? In our warehouse we use the 2.4 GHz radio only.

Update to the most recent version. I use 5.1.9.3 (upgrade to vx9000 and 5.9.1.5 - very stable now)

One SSID is dedicated to the collectors in the warehouse. Only one.

Smart-RF is enabled.  11 to 15 dBm.

Tunnel or Local:  in my case only in very high deployments, Local is superior. Distributed processing.

Disable 802.11d and Power Save (display and WiFi) on the collectors.

Update Fusion Driver (if you are using Zebra/Motorola Collectors).

Good luck!

MarkS

Thanks everyone, I'll take this away and see how far I get.

Mark