Extreme Networks

Anonymous · ‎12-03-2017

Hi There,

Apologies for this question being a little long....

Just looking for the community's thoughts on some best practices for configuring VRRP.

Preempt

By default the preempt delay is 0 seconds, so the transition to master would take about 3 hellos, which are sent every 1 second. So my question is: would a 3-second preempt delay be deemed sufficient? I've seen some set to 90 seconds, the logic being to give the network a chance to stabilise before going to master, to stop flapping. Is there a formula you could use? And what if you have more than 2 routers in the VRID?
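For reference, here is a minimal sketch of the "3 hellos" timing as I understand it from the VRRP RFCs (RFC 3768 / RFC 5798): a backup declares the master down after three advertisement intervals plus a priority-based skew time. The function names are my own, and this models the standard timers only, not any EXOS-specific behaviour or the preempt delay itself.

```python
def skew_time(priority: int, adver_interval: float = 1.0) -> float:
    """Skew time in seconds: higher-priority backups time out slightly sooner."""
    if not 1 <= priority <= 255:
        raise ValueError("VRRP priority must be between 1 and 255")
    return (256 - priority) * adver_interval / 256


def master_down_interval(priority: int, adver_interval: float = 1.0) -> float:
    """Seconds a backup waits without hearing an advertisement before taking over."""
    return 3 * adver_interval + skew_time(priority, adver_interval)


if __name__ == "__main__":
    # Compare a few priorities at the default 1-second advertisement interval.
    for prio in (100, 200, 255):
        print(prio, master_down_interval(prio))
```

With more than two routers in the VRID, the skew term is what makes the highest-priority backup time out first, so the formula itself does not change with the number of routers.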

This article with VRRP and FREB shows a preempt delay of 5 seconds:
https://extremeportal.force.com/ExtrArticleDetail?an=000080659

Accept Mode

In EOS I have in the past turned accept-mode on so that you are able to ping the VRRP VIP address, but in EXOS you do not need to do this. So I'm wondering what other practical / best practice reasons there would be for turning it on. One example might be to support NTP over the VIP as per the following GTAC article:

https://extremeportal.force.com/ExtrArticleDetail?an=000081389

Fabric Routing

This was mentioned above, but I've given it its own heading for comment. In that example the preempt delay was set to 5 seconds, so I'm just wondering whether the use of fabric routing, and even the number of participating routers in the same VRID, should be something to consider?

Tracking

VRRP can track pings, IP routes and VLANs. There are probably some obvious cases where that is a good idea, but I'm interested in some practical examples and / or best practices. As an example, the GTAC article below shows how to configure VLAN tracking so that VRRP fails over to the other router if a VLAN fails. That sounds great, but would it be considered good practice to do that on every VLAN?

https://extremeportal.force.com/ExtrArticleDetail?an=000061651

Host Mobility

An explanation for this is given here:

http://documentation.extremenetworks.com/exos/EXOS_21_1/VRRP/c_vrrp-host-mobility.shtml

I can see this possibly making sense when using fabric routing mode and when multiple routers are in the same VRID. In fabric routing mode with MLAG, my perception is that traffic could end up at either switch in the MLAG pair, determined by the hashing algorithm configured on the LAG, and then be routed from there. Both routers would essentially be advertising the same subnet, so asymmetric routing could take place, as return traffic could land at the other router (the other switch in the MLAG pair). I don't think that actually matters though, because that switch would see the device as directly attached through the other link in the LAG and would forward the traffic straight to the client instead of passing it back to the originating router.
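To illustrate what I mean by the hash deciding where traffic lands, here is a rough sketch. It is purely illustrative: the CRC32-over-addresses hash, the function name and the switch names are my own assumptions, not the actual EXOS LAG hashing algorithm.

```python
import zlib


def pick_mlag_member(src_ip: str, dst_ip: str,
                     members=("switch-A", "switch-B")) -> str:
    """Map a flow to one LAG member link (and so to one MLAG peer).

    Hypothetical hash: CRC32 over the address pair, modulo the link count.
    """
    key = f"{src_ip}->{dst_ip}".encode()
    return members[zlib.crc32(key) % len(members)]


if __name__ == "__main__":
    # Forward and return directions are hashed independently, so they may
    # land on different peers, which is the asymmetric case described above.
    print("forward:", pick_mlag_member("10.1.1.10", "192.0.2.50"))
    print("return: ", pick_mlag_member("192.0.2.50", "10.1.1.10"))
```

The point is only that each direction of a flow is hashed separately, so with fabric routing either peer may end up routing it.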

Interested in your thoughts.

Many thanks in advance

Erik_Auerswald · ‎12-04-2017

Hi Martin,

thanks for posting a good question.  I'll try to add my 2¢ to your thoughts.

Preempt Delay: The idea is to give routing protocols etc. time to start up and converge. This is generally a good idea, but I am not sure if this still helps in combination with fabric routing.
Accept Mode: I would enable that only if the VRRP address is supposed to provide some kind of service. That is usually not the case for switches. With switches or routers it is often better to bind services to an anycast address in combination with dynamic routing.
Fabric Routing: I would generally enable this for symmetric setups, e.g. MLAG. But there might be some time during fail-back when the returning switch will use layer 3 forwarding, but does not yet have a complete routing table, which may result in traffic blackholing. I have never analyzed the exact behavior in that case.
Tracking: Important for optimal traffic flow in asymmetric setups, e.g. with two routers that are connected to one of two core or distribution switches only. Fabric Routing kind of defeats the idea of interface tracking, because the backup router will forward packets anyway.
Host Mobility: That is part of a specific solution to VM mobility across data centers. It can be used if VRRP with Fabric Routing is used to implement an anycast gateway with optimal return path for leaf & spine designs with routing on the leaf switches.

Thanks,
Erik

Erik_Auerswald · ‎12-04-2017

An equivalent of the EOS (S-Series) command vrrp interface-up-delay VRID SECONDS would be a great addition to EXOS, since it addresses just this problem.

Erik_Auerswald · ‎12-04-2017

Yes, that is a useful solution for MLAG setups. For symmetric non-MLAG (e.g. L3) setups Fabric Mode is useful during steady state, but something like a "Fabric Mode Forwarding Delay" for smooth fail-back would be nice. 

Stephane_Grosj1 · ‎12-04-2017

For point 3, since 22.3 and a recent patch of 21.1, we have introduced a "restore timer" for MLAG ports. In the event of an MLAG peer restart, the MLAG ports wait for that timer to expire before enabling themselves, to give L3 protocols time to converge.


VRRP best practices, preempt, tracking, fabric routing, accept-mode, host mobility
