Wireless clients sporadically getting Limited Connectivity, APs dropping packets


Userlevel 2
We have a pair of V2110 wireless controllers and around 400 3825i access points. Controller firmware is 09.21.06.0002. AP firmware is 9.21.27.1387X.

About 4 months ago we noticed a problem where the APs would just drop traffic. A packet capture will show a ping coming into the radio from the client, going out the Eth interface on the AP, the reply coming back into the Eth interface, and never going out of the radio.

A particularly strange detail is that this only happens with certain source/destination traffic. For instance, a client can ping the gateway at 10.1.20.1/24 but cannot ping the server at 10.1.20.2. Since we can see the ping making it to the server and the ping reply making it back to the AP, we know it's not a firewall/ACL/routing issue on any other part of the network. The traffic disappears at the AP. Clients cannot communicate with anything on the Internet or any of our servers (DNS, DHCP, etc.) during this time. They can ping gateways in any subnet but that's it; everything else fails.
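
The reasoning above (follow one ping through each capture point and see where it stops) can be sketched in a few lines. This is only an illustration of the logic, not a GTAC or Extreme tool; the capture-point names and the `first_missing_hop` function are hypothetical labels chosen for this example.

```python
# Capture points in the order an echo request and its reply cross the AP.
# These names are made up for illustration; they are not AP interface names.
PATH = ["radio_in", "eth_out", "eth_in", "radio_out"]


def first_missing_hop(seen):
    """Return the first capture point where the packet was NOT observed.

    `seen` is a set of capture-point names where the packet showed up.
    Returns None if the packet was observed at every point.
    """
    for hop in PATH:
        if hop not in seen:
            return hop
    return None


# The failing case described above: the reply reaches the AP's Eth
# interface but never leaves the radio.
observed = {"radio_in", "eth_out", "eth_in"}
print(first_missing_hop(observed))
```

If the first missing point is `radio_out`, the request and reply both crossed the wired side, which is why the wired network can be ruled out.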

More details:

-We've gone through several firmware updates with GTAC. Nothing has helped.
-The issue can affect any client on any AP in any location at any time.
-The issue is intermittent; it may happen every few minutes or once per month to a particular client in a particular area. Once failed it may last for seconds or days.
-While it's affecting one client, other clients on the AP may be working fine or may also fail.
-It can happen on either 2.4 or 5.7GHz.
-It can affect any type of client (phone, laptop, tablet) and any OS.
-Restarting the client or AP will sometimes fix the issue, sometimes not.
-It happens with or without using NAC, or any other type of authentication. It happens on completely open networks as well.
-It happens on both bridged at AP and bridged at controller topologies.
-It's not limited to a particular subnet or VLAN.
-It happens with or without Flexible Client Access enabled.
-It happens on either controller.
-GTAC hasn't made much progress on the issue in 4 months.

164 replies

Userlevel 3
9.21.06? You must have an early release. The newest I can see is 9.21.05.

Were your controllers upgraded from V8 code?

We ran into an issue at a customer site where clients would connect, but would not get a DHCP address. It seemed to be at random times/locations. If you walked around one of the sites, you would suddenly get an IP and be online and working.

All went away once we recreated the VNS. We figured that somehow, over time/updates, the VNS/WLAN got corrupted. Recreating it with the same filters/roles/COS settings resolved our issue.

We have not had the issue return since.

Thanks,

Bill
Userlevel 7
Bill Handler wrote:

9.21.06? You must have an early release. The newest I can see is 9.21.05.

9.21.06.006-1 was made available for download on the Extranet moments ago: https://extranet.extremenetworks.com/downloads/Pages/WirelessControllers.aspx
Userlevel 2
Correct, we were sent an early release by GTAC.

This issue happens on every VNS (we've created 4 or 5 new ones for testing at this point).

I'm not sure which version of code our controllers started with. Might there be a log of that somewhere in the software?
Userlevel 3
Logs/SW Upgrade I think

Have you tried to spin up a new fresh 2110?
Userlevel 2
Are you using Radio Preference load groups?
Userlevel 2
In one area (5 APs) on campus we are, but we haven't had any issues in that location.
Userlevel 2
Have you had any problems since?
Kinda looking at the same situation.
3825i and a V2110 medium at the same patch level.
Userlevel 3
I had the same problem and noticed that every time it happened there was a kernel error on the AP. So it's an AP problem. Apparently there is a fix coming out for this at the end of the month. I just installed 9.21.07.0002 code yesterday and haven't seen the issue yet.
Userlevel 2
on 09.21.06.0006 at the moment.
Userlevel 7
Code 9.21.07 should address this issue. The ETA on this code should be the end of this month...

Reference: https://gtacknowledge.extremenetworks.com/articles/Solution/Users-randomly-lose-network-and-internet...
Userlevel 7
Hi Doug,

Any update on the release date? I have an upgrade window tomorrow and would prefer 9.21.07 over 9.21.06.

Thx,
Ron
Userlevel 2
Any update on the 9.21.07 code?
Userlevel 7
Should be posted on or before 3/2
Userlevel 2
9.21.07 did not solve the issue for us. It changed it slightly: now we can't ping anything at all when it happens, whereas before it was only host addresses that we couldn't ping.
Userlevel 2
How often is the problem happening for you? We have been getting reports of students having internet problems, and the more I talk with them, the more it seems the problem is more widespread for us than I originally thought. Disappointed to hear that 9.21.07 did not fix the ongoing issue. What's next?
Userlevel 7
We have many beta test accounts where code 9.21.07 has fixed connection issues. I would suggest that anyone still having issues after 9.21.07 contact, or continue to work with, GTAC support to iron out any site-specific problems.
Userlevel 2
It happens often. We have hundreds of open calls, with most of those users being affected daily.
Userlevel 2
Are you able to tell in any way on the back end, through AP traces/logs or anything like that, when it happens? I originally thought our problem was limited to 2.4 GHz, but I think that was another issue that was corrected in earlier 9.21 code. We have had some students express frustration with wifi at times recently, and having worked with them, it seems like most of them point back to the issue discussed here.
Userlevel 2
Nope, the only way to see it for sure is to catch a client while it's failing. It's often working by the time we get to the room, and fails again just after we leave. Very frustrating.
The traces I have show hostapd thinking it sends out a response to EAPOL start, but you never see it in the air, so auth times out. For us it's reliable: once an AP is broken, all 802.1X clients on it will be broken. It also takes a day or so after a reboot to start happening, so for a while I was rebooting all APs overnight.

I'm trying downgrading to 09.21.03 while I wait for this to be fixed - there was a similar problem with no EAPOL response that was supposedly fixed in 09.21.04, but it was at least less frequent.
Userlevel 2
Are you on 9.21.07 also ?
JP wrote:

Are you on 9.21.07 also ?

I have 09.21.07.0005, which does at least fix 802.11r-related crashes on AP3825 (although I have 11r turned off now for the time being anyway).
Userlevel 7
9.21.07.0007 is released and available for download on the server
Userlevel 3
We have the same problem: 480 APs and two V2110 controllers with 3805i and 3825i APs. The situation is frustrating.
We are on 09.21.5 on the controller, and in some buildings we have installed 9.21.07.0002, but the problem persists.
Does 09.21.07.0005 solve the problem?
Is there any way to reboot all the APs automatically overnight?
Regards
I hope that the update is published as soon as possible.
Userlevel 7
FES wrote:

We have the same problem: 480 APs and two V2110 controllers with 3805i and 3825i APs. The situation is frustrating.
We are on 09.21.5 on the controller, and in some buildings we have installed 9.21.07.0002, but the problem persists.
Does 09.21.07.0005 solve the problem?
Is there any way to reboot all the APs automatically overnight?
Regards
I hope that the update is published as soon as possible.

9.21.07 is on our GA download site now. The beta build you have was our first attempt at the fix. The GA build has the most current update fixes in it.
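
Since the thread mixes version strings with and without leading zeros (for example 09.21.07.0005 and 9.21.07.0007), here is a rough sketch for checking whether an installed build is older than a reference build. It assumes every dot-separated field is purely numeric, which is an assumption of this example and not a documented Extreme versioning rule; a string with a letter suffix would raise `ValueError`. The helper names `parse_version` and `is_older` are hypothetical.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers.

    Leading zeros are ignored, so '09.21.07.0005' and '9.21.7.5'
    parse to the same tuple. Non-numeric fields raise ValueError.
    """
    return tuple(int(part) for part in text.strip().split("."))


def is_older(installed, reference):
    """True if `installed` sorts strictly before `reference`."""
    return parse_version(installed) < parse_version(reference)


# Build numbers quoted in this thread: the early 9.21.07.0002 build
# versus the released 9.21.07.0007 build.
print(is_older("9.21.07.0002", "9.21.07.0007"))
```

Tuples compare field by field, so a shorter string such as 9.21.07 sorts before any 9.21.07.x build.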
