Switch Feature: RADIUS Dead

  • Idea
  • Updated 2 years ago
During a current project I learned that Cisco supports a great feature that I currently miss (and have missed for a long time) on all EOS and EXOS switches!

Marking an unreachable RADIUS server as DEAD and skipping it in all following authentication sessions. Once the RADIUS server is reachable again, it is used again.

EXOS and EOS have to run every authentication session through the configured timeout and retries before the failover server is used. If the primary RADIUS server has failed, the complete authentication process takes longer because it has to wait for the timeouts.
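To put a rough number on that delay, here is a small back-of-the-envelope sketch. The function name and the example values (3 s timeout, 3 retries) are made up for illustration, and I am assuming that "retries" means additional attempts after the first one; the exact semantics may differ per platform.

```python
def worst_case_delay_s(timeout_s, retries, dead_servers_first):
    """Seconds spent waiting on dead servers before a live one is tried.

    Assumption: each dead server costs one initial attempt plus
    `retries` further attempts, each waiting the full timeout.
    """
    attempts_per_server = 1 + retries
    return timeout_s * attempts_per_server * dead_servers_first


# Hypothetical values: 3 s timeout, 3 retries, primary server is down.
delay = worst_case_delay_s(timeout_s=3, retries=3, dead_servers_first=1)
print(f"every authentication waits about {delay} s before failover")
```

Without any dead marking, this wait is paid again by every single authentication session.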

Cisco achieves this with a "RADIUS Server dead" status.
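What I have in mind could look roughly like the following sketch. This is not Cisco's implementation and not anything EXOS or EOS provides; the class name, the `dead_time_s` parameter and the injectable clock are all hypothetical and only illustrate the idea of skipping a dead server until its dead time has expired.

```python
import time


class DeadAwareSelector:
    """Tracks which RADIUS servers are marked dead and skips them."""

    def __init__(self, servers, dead_time_s, clock=time.monotonic):
        self.servers = list(servers)
        self.dead_time_s = dead_time_s
        self.clock = clock          # injectable so the logic can be tested
        self.dead_until = {}        # server -> time when the dead mark expires

    def mark_dead(self, server):
        """Call this after a server exhausted its timeout and retries."""
        self.dead_until[server] = self.clock() + self.dead_time_s

    def is_dead(self, server):
        expiry = self.dead_until.get(server)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Dead time is over: the server becomes a candidate again.
            del self.dead_until[server]
            return False
        return True

    def candidates(self):
        """Servers to try for the next authentication, in configured order."""
        return [s for s in self.servers if not self.is_dead(s)]
```

A session would then only contact the servers returned by `candidates()`, so the timeout for a failed primary is paid once and not on every authentication.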

http://www.cisco.com/c/en/us/products/collateral/ios-nx-os-software/identity-based-networking-servic...

I think EXOS (and maybe EOS) should be enhanced with such a feature.


Regards

M.Nees, Embassador


Posted 2 years ago


Bastian Sprotte, Employee

Matthias,

If you change the RADIUS algorithm to use both RADIUS servers, this will reduce the RADIUS requests sent to DOWN RADIUS servers:

Slot-1 Stack.1 # configure radius algorithm round-robin
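To illustrate why round-robin helps, here is a simplified sketch of the general idea. It is not a description of the EXOS internals; the function name and the two server labels are hypothetical. With the standard algorithm every request starts at the first configured server, while round-robin rotates the starting server per request.

```python
def server_order(servers, request_index, algorithm="standard"):
    """Return the order in which servers are tried for one request."""
    if algorithm == "round-robin":
        # Rotate the list so each request starts at the next server.
        start = request_index % len(servers)
        return servers[start:] + servers[:start]
    # "standard": always start with the first configured server.
    return list(servers)


servers = ["primary", "secondary"]  # assume "primary" is currently down

for algorithm in ("standard", "round-robin"):
    hits = sum(
        server_order(servers, i, algorithm)[0] == "primary"
        for i in range(10)
    )
    print(f"{algorithm}: {hits} of 10 requests start at the dead server")
```

With two servers, only every second request starts at the dead server, so on average half of the sessions avoid the timeout.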

I know we worked on a function to reduce such calls to dead RADIUS servers and to switch the main priority to the working RADIUS server. Let me check where we are with that, as this would provide the function you are asking for.

regards
Bastian

M.Nees, Embassador

Hi Bastian,

thanks for clarifying and (hopefully) helping to get this useful feature ...

Unfortunately, the RADIUS round-robin feature is only available on recent EXOS and EOS (S/K-Series). SecureStack (B5/C5) does not provide this feature.


Regards

Erik Auerswald, Embassador

Hi,

having suffered (as a user) through a 4h network outage caused by using Cisco's RADIUS Dead setting, I'd say this is a problematic feature. In this case all RADIUS servers were unavailable for a few minutes due to a network outage, resulting in the switches marking them as Dead. The switches then did not try to reach any of the configured RADIUS servers for the configured dead time of 4h. Since the network used dot1X, network access was lost for basically everybody as the re-authentication timers expired.

To avoid this failure case, an implementation of a RADIUS Dead feature should ignore the Dead status if all RADIUS servers are lost.
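A minimal sketch of that rule, with a hypothetical function name and made-up server labels (this is not taken from any vendor implementation):

```python
def servers_to_try(servers, dead):
    """Skip servers marked Dead, unless that would leave nothing to try."""
    alive = [s for s in servers if s not in dead]
    # If every server is marked Dead, ignore the Dead status and try
    # all of them again, so nobody waits out the whole dead time.
    return alive if alive else list(servers)


print(servers_to_try(["radius1", "radius2"], dead={"radius1"}))
print(servers_to_try(["radius1", "radius2"], dead={"radius1", "radius2"}))
```

In the second call both servers are marked Dead, so both are tried again anyway and the switch notices as soon as one of them is back.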

Best regards,
Erik

David Froehlich

That's an interesting scenario Erik pointed out, and it should be considered. In a Cisco environment you would most likely use multiple RADIUS server groups in the AAA config to avoid this problem: one with dead timers and a fallback group without dead timers. But that's more or less only theory. In XOS we have neither dead timers nor server groups (nor a few other neat features).