Client roaming to preferred radio triggers a RADIUS authentication event, which fails


Userlevel 6
Currently I have a very strange problem.
We use EAP-TLS 802.1X authentication for an internal SSID for notebooks. The EWC is installed at the headquarters. Two AP 3705s are installed at the affected branch, running V9.21.07. The NAC Gateway 6.2.0.x is also installed at the headquarters and acts as the RADIUS proxy to the NPS on the Windows AD 2008 server. This has worked well for the last few years.
Now we have changed the WAN connection of this branch from MPLS to VPN with IPsec. After this change, a lot of internal WLAN clients that connected before without problems are rejected by the NAC Gateway. All other branches are working well. On the wired switches we use only MAC authentication, which is also not affected.

Error:
802.1x (identify) - Authentication became stale

After some troubleshooting I realized that when a client roams within the AP to its preferred radio, that roaming event triggers a RADIUS request. The first request (to the first radio) is always positive (accepted), and then the AP-internal switch to the preferred radio triggers a second RADIUS request, which is always rejected with the above error message.

As a temporary solution I disabled radio 1, and now all clients can log in without problems!

This is very strange.

First question:
Why does a switch from radio 2 to radio 1 trigger a RADIUS event? Can I disable this new login request in the AP / EWC config?
Second question:
If this request is needed, why does it become stale and get rejected?


Thanks for any advice.
Regards

Userlevel 6
Hi,

I would guess the issue is with the MTU: check the config for your APs and your VPN.

If I remember correctly, MAC authentication on the EWC has an option to configure whether you want the reauth to happen or not. Go to the WLAN service => authentication.

Regards
Userlevel 6
Hi Zdenek,

I checked the MTU from the headquarters to the AP with "ping -f -l 1400 IP-of-the-AP", which works fine with an MTU of 1400. I also tested with lower MTU values, which had no positive effect.

Within the internal SSID I use 802.1X privacy, no MAC auth.

I cannot understand why roaming between the radios of one AP triggers a completely new authentication request. And why does the request fail on the second run, when the first run to the first radio is always accepted?

Regards
Userlevel 7
I assume radio preference is enabled and that is the reason the client is switching between radio 1 and 2, correct?

I also vote for an MTU problem.
Userlevel 6
Yes, radio preference on the client is enabled.

But the fact that disabling radio 1 (to avoid the roaming inside the AP) solves the problem speaks against an MTU problem!
I also checked the possible MTU size with different "ping -f -l max-packet-size" values.

Are there any suggestions on how to find the root cause?
Userlevel 6
I would try a packet capture of the authentication packets to see why the authentication became stale. I expect that some packets are being lost. The question is where: client to AP, AP to controller, or controller to RADIUS server. (It can also be configured as a SITE, with the AP talking to the RADIUS server directly.)
Userlevel 6
If your MTU is 1400, what value do you have at your AP?
Userlevel 6
Hi Zdenek,

I tested with "ping -f -l 1400". An MTU of 1400 bytes passes through the network, so I also configured the AP with MTU = 1400.

Am I doing something wrong?
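
For reference, the -l value sets the ICMP payload size, not the size of the IP packet, so the two numbers differ by the header overhead. A minimal sketch of that arithmetic, assuming plain IPv4 without IP options (the function names are made up for illustration):

```python
# Header sizes for an ICMP echo over IPv4 (assumption: no IP options).
ICMP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20


def ip_packet_size(ping_payload: int) -> int:
    """IP packet size produced by 'ping -f -l <ping_payload>' on Windows."""
    return ping_payload + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES


def max_ping_payload(path_mtu: int) -> int:
    """Largest -l value that still fits into one unfragmented IP packet."""
    return path_mtu - ICMP_HEADER_BYTES - IPV4_HEADER_BYTES


print(ip_packet_size(1400))    # IP packet size for the test above
print(max_ping_payload(1400))  # payload to use when the path MTU is 1400
```

So a successful "ping -f -l 1400" shows that the path carries IP packets of 1428 bytes, which is more than an MTU of 1400 would require.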
Userlevel 6
Regarding the reauthentication, I believe it is part of the standard that authentication/association to a new BSSID means generating new encryption keys. If your client supports OKC, you can enable it.
Userlevel 6
Opportunistic Keying is already enabled on this WLAN service.
Userlevel 7
In my opinion it makes sense to see a second 802.1X authentication if radio preference is enabled, as the client doesn't roam between the radios; it's a new connection.

As a workaround you could also disable radio preference and enable radio 1 again. I'm pretty sure that will work.
Then enable it on only one AP so you can troubleshoot the issue with GTAC.
Userlevel 6
Hi Ronald,

We configure the preferred radio on the client devices (Windows driver settings).

I see this is also possible via AP "Load groups", but that is not configured.

Regards
Userlevel 1
Did you try enabling fast roaming?

Regards
Userlevel 6
Matthias

Do you have AP secure tunnel and is NAT involved?

-Gareth
Userlevel 6
Hi Gareth,

Secure tunnel is completely disabled. NAT is not involved!
The customer's network is divided into subnets in the 10.x.x.x IP range. HQ and branch are connected via an IPsec tunnel without any kind of NAT.

Regards
Userlevel 6
Because we have downtime due to this issue, I opened a GTAC case to solve it: 01232203.
Could it be related to...
https://gtacknowledge.extremenetworks.com/articles/Solution/802-1x-client-authentication-takes-a-lon...
Userlevel 6
Hi Folks,

This problem is still unresolved!!

GTAC pointed me to this solution:
https://gtacknowledge.extremenetworks.com/articles/Solution/Apple-clients-take-very-long-time-to-get...

I got a Wireshark trace of a rejected end system which supports this guess:
NPS is not able to deliver the server certificate to the client (and then the request is rejected)!

The problem with the above solution is that it only works if NPS accepts the RADIUS request. So clients are still rejected (because the MTU is too big). The reduced Framed-MTU never reaches the problematic clients!

When I debug the RADIUS request on the NAC Gateway, I see that the Framed-MTU value is set to 1400 (request from the EWC).

Can I change this value on the EWC?

My first guess was that this is calculated from the configured AP MTU size. But after I changed the AP MTU to 1300, I saw that the Framed-MTU did not change (still 1400). So this seems to be fixed in the EWC config. From my point of view it should be calculated in conjunction with the configured AP MTU!

Regards
Userlevel 7
Hi Matthias,

I do not quite understand the MTU problem. You wrote that you can ping the AP with a 1428B IP packet (1400B ICMP Echo Request data + ICMP header + IP header), and that a Framed-MTU of 1400 is used. That seems to fit.

Additionally you write that authentication works fine with one radio disabled. That suggests that the network is able to transport the certificate.

But then you write that the server certificate cannot be transported to the client.

I would guess that one packet containing part of the certificate is lost on its way from the server (NAC) to the client (AP), ultimately resulting in a reject.

It is interesting that there seems to be reliable packet loss with two back-to-back authentication attempts. As if that crossed some rate limiting threshold.

HTH,
Erik
Userlevel 6
Additionally you write that authentication works fine with one radio disabled.
--> I believed that because there was an accept in the NAC Manager GUI.
That was my mistake.
Userlevel 7
OK, so the presumed workaround did not actually work around the problem. 😞

I would suggest you use Wireshark (or similar) to check the actual size of the RADIUS packets, because your ping test and the Framed-MTU value suggest that the MTU size is not the problem.
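
As a rough cross-check for the capture, the size of a RADIUS packet that carries one EAP fragment can be estimated from the attribute layout. This is only a sketch under assumptions: it counts the RADIUS header, the EAP-Message attributes (at most 253 bytes of EAP data each, plus a 2-byte attribute header), one Message-Authenticator attribute, and UDP and IPv4 headers without options. Any further attributes (State, Proxy-State and so on) go into other_attr_bytes, a hypothetical parameter, and the function name is invented for this example.

```python
import math

RADIUS_HEADER_BYTES = 20          # code, identifier, length, authenticator
ATTR_HEADER_BYTES = 2             # type + length in front of every attribute
MAX_ATTR_VALUE_BYTES = 253        # largest value a single attribute can carry
MESSAGE_AUTHENTICATOR_BYTES = 18  # 2-byte header + 16-byte value
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20            # assumption: no IP options


def radius_ip_packet_size(eap_bytes: int, other_attr_bytes: int = 0) -> int:
    """Estimate the IP packet size of a RADIUS message carrying one EAP packet.

    other_attr_bytes is a placeholder for attributes this sketch does not
    model; read the real value from a packet trace.
    """
    # The EAP packet is split over as many EAP-Message attributes as needed.
    eap_attrs = math.ceil(eap_bytes / MAX_ATTR_VALUE_BYTES)
    eap_attr_bytes = eap_bytes + eap_attrs * ATTR_HEADER_BYTES
    return (IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RADIUS_HEADER_BYTES
            + eap_attr_bytes + MESSAGE_AUTHENTICATOR_BYTES + other_attr_bytes)


print(radius_ip_packet_size(1400))
```

If the server really sends EAP fragments close to the Framed-MTU of 1400, this estimate already comes to 1478 bytes before any other attributes are added, so it is worth comparing against the packet sizes actually seen in the trace.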

If the VPN MTU is the problem, you should be able to see RADIUS packets on the interface leading to the VPN, but not on the other end exiting the VPN. There might even be an ICMP "Packet Too Big" (Fragmentation Needed) message visible in a packet trace. If the MTU is the problem, no larger packet at all can cross the VPN. You can verify the actual VPN MTU using "ping -f -l SIZE" (on Windows). The generated IP packet will be 28 bytes bigger than SIZE; you can see this in Wireshark.

If a packet from RADIUS server to the AP containing part of a certificate is lost, this authentication session can only succeed if the re-transmit timer and count of the RADIUS client match the RADIUS server settings for re-transmits. If a re-transmitted packet arrives after the time allowed by the server, the server will answer with an Access-Reject.

Extreme Control accepts an answer inside a 5s window after sending a packet. If it takes the client longer to request a re-transmit for a lost packet from the server, the authentication session will fail (Access-Reject). This problem exists only with bigger RADIUS messages needing more than one packet (e.g. a certificate), because in most other cases the RADIUS server will treat the re-transmit request as a completely new session.
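
The timing condition above can be sketched as a small model. The 5 s window is taken from the description above; the retransmit interval and retry count are example values, the function name is hypothetical, and treating an answer that arrives exactly at 5 s as accepted is an assumption.

```python
def retransmits_inside_window(interval_s: float, retries: int,
                              window_s: float = 5.0) -> list[float]:
    """Return the retransmit times (seconds after the server sent its
    packet) that still fall inside the server's acceptance window."""
    # Assumption: the client retransmits at a fixed interval.
    times = [interval_s * attempt for attempt in range(1, retries + 1)]
    return [t for t in times if t <= window_s]


print(retransmits_inside_window(2, 3))  # the first two attempts are in time
print(retransmits_inside_window(6, 3))  # every attempt is too late
```

In this model, a client whose first retransmit comes later than the window can never recover a lost packet, which would match a session that ends in an Access-Reject.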

Anyway, if the problem really is an MTU problem because of the VPN, you might be able to fragment inside the VPN despite the don't-fragment bit. This of course depends on the VPN solution used.

HTH,
Erik
Userlevel 6
Hi Folks,

I am back from my holiday and surprise, surprise, some (not very many) problems are solved!

My WLAN problem is solved!!

The root cause was a bug in the FortiGate OS. It seems that all CAPWAP/WASSP traffic was not handled correctly by the firewall. My co-worker saw this during a firewall debug. As a workaround he disabled ASIC offloading for CAPWAP packets, and at once all wireless clients were running without problems!

After an update to the latest FortiOS 5.4.1, everything is running well.

Thanks to all who helped me with ideas!

Regards
Userlevel 6
That's a great way to come back from holiday, Matthias.

Thanks for updating the Hub with your good news and valuable information for those who read this in the future.
