The asynchronous forwarding nature inherent in an MLAG with VRRP topology is well known and well tested. In a highly meshed design, there can be many actively forwarding pathways, which makes troubleshooting difficult. In the topology shown below, the “LAN Core” consists of 4 switches forming a cluster of 4 VRRP routers and 2 MLAG pairs. Host VLANs exist only between the ToR stacks and this LAN core. As the VRRP configuration is MSTR/BKUP/BKUP/BKUP, only one forwarder is active at any time.
The routing configuration between the “Upstream routers” and the “LAN Core” routers is static, protected by BFD sessions. This was done to keep convergence time as low as possible. The upstream routers learn the rest of their routes via OSPF. Each upstream router has two static routes for each host VLAN: the higher-priority gateway is the VIP of the VRRP cluster, and the lower-priority gateway is the other upstream router. Testing showed that this design, paired with BFD, provided very fast and reliable forwarding under any reasonable circumstance (that is, whenever at least one pathway existed).
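The route preference described above can be modelled in a few lines. This is only a minimal Python sketch, not router or EXOS configuration: the names `StaticRoute` and `pick_next_hop`, the addresses, and the priority values are hypothetical, and it assumes that a static route whose BFD session is down is simply not eligible for forwarding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StaticRoute:
    next_hop: str   # hypothetical address, for illustration only
    priority: int   # assumption of this sketch: lower value = more preferred
    bfd_up: bool    # state of the BFD session protecting this route


def pick_next_hop(routes: List[StaticRoute]) -> Optional[str]:
    """Return the most preferred next hop whose BFD session is up, if any."""
    usable = [r for r in routes if r.bfd_up]
    if not usable:
        return None
    return min(usable, key=lambda r: r.priority).next_hop


# Two static routes for one host VLAN, as seen by one upstream router.
via_vrrp_vip = StaticRoute("10.0.0.1", priority=1, bfd_up=True)
via_other_upstream = StaticRoute("10.0.1.2", priority=2, bfd_up=True)
routes = [via_vrrp_vip, via_other_upstream]

print(pick_next_hop(routes))   # the VRRP VIP while its BFD session is up
via_vrrp_vip.bfd_up = False    # BFD detects a failure toward the LAN core
print(pick_next_hop(routes))   # traffic now goes via the other upstream router
```

The point of the model is that the fallback only happens when BFD actually goes down; a failure that BFD cannot see leaves the preferred route in place.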
Extensive testing showed very positive results, with just one failing scenario. If either of the ISC links went completely down, we saw intermittent loss for some North-South flows; East-West flows were unaffected. Investigating the root cause, we determined that the upstream router, being unaware of the loss of the ISC, continued to load balance packets sourced in “the rest of network” and destined for the host VLANs over the two ports connected to the VRRP cluster. When a BKUP router in the cluster received a packet destined for a host VLAN and had no viable pathway to the VRRP MSTR, it dropped the packet. The loss was intermittent because the hashing at the upstream router sent only some flows down the physical port that ultimately led to a dead end. Flows hashed to the opposite side were unaffected, as that pathway always resolved to the MSTR and the packets were forwarded.
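To illustrate why the loss was intermittent rather than total, here is a small Python model of per-flow hashing over two uplink ports. It is a sketch under stated assumptions, not the upstream router's real algorithm: the XOR hash, the flow tuples, and the convention that port 0 reaches the MSTR while port 1 reaches a BKUP that depends on the ISC are all made up for the example.

```python
import ipaddress


def egress_port(flow, n_ports=2):
    """Pick an egress port by XOR-ing the flow fields (a stand-in for the real hash)."""
    src_ip, dst_ip, src_port, dst_port = flow
    key = (int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
           ^ src_port ^ dst_port)
    return key % n_ports


def delivered(flow, isc_up):
    """Assumption: port 0 leads to the VRRP MSTR, port 1 to a BKUP that needs the ISC."""
    return egress_port(flow) == 0 or isc_up


# Hypothetical North-South flows: (source IP, destination IP, source port, destination port)
flows = [("192.0.2.10", "10.20.0.5", 40000 + i, 443) for i in range(1000)]

for isc_up in (True, False):
    lost = sum(not delivered(f, isc_up) for f in flows)
    print(f"ISC up={isc_up}: {lost} of {len(flows)} flows lost")
```

In this model a given flow is either always delivered or always dropped while the ISC is down, which matches the observation that only some North-South flows were affected.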
Several solutions were considered, such as:
- Adding redundant pathways for the ISC
- Converting to a dynamic routing protocol throughout
- Converting to Active/Active VRRP
The approach we settled on is a UPM script on the MLAG peers that reacts to the loss of the MLAG peer by forcing the BFD session to go down. We considered other events (link down, LACP, etc.) as triggers, but the reliable event was the MLAG peer message. We also attempted to “disable bfd” to bring the session down gracefully; this did not work, as EXOS took no action when the BFD session was ended by a close message. We changed to simply blocking UDP port 3784 at the MLAG peers, and this did the trick.
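The trigger logic can be summarised as a tiny state machine. The Python below is a toy model only, not EXOS behaviour: the class `MlagPeerModel`, the detect multiplier of 3, and the instant recovery once packets flow again are assumptions made for illustration. The event names are the ones used in the log filters in the configuration, and 3784 is the UDP port for single-hop BFD control packets.

```python
class MlagPeerModel:
    """Toy model: MLAG peer events toggle an ACL that drops BFD control packets."""

    BFD_CONTROL_PORT = 3784  # UDP port for single-hop BFD control packets

    def __init__(self, detect_multiplier=3):
        self.acl_blocking_bfd = False
        self.detect_multiplier = detect_multiplier  # hypothetical value
        self.missed = 0
        self.upstream_bfd_up = True

    def on_log_event(self, event):
        """Mimic the two UPM profiles: add the deny ACL on peer down, remove it on peer up."""
        if event == "VSM.RmtMLAGPeerDown":
            self.acl_blocking_bfd = True
        elif event == "VSM.RmtMLAGPeerUp":
            self.acl_blocking_bfd = False

    def tx_interval_elapsed(self):
        """One BFD transmit interval passes, as seen by the upstream router."""
        if self.acl_blocking_bfd:
            self.missed += 1
            if self.missed >= self.detect_multiplier:
                self.upstream_bfd_up = False  # detection time expired
        else:
            self.missed = 0
            self.upstream_bfd_up = True  # simplification: recovery is immediate


model = MlagPeerModel()
model.on_log_event("VSM.RmtMLAGPeerDown")
for _ in range(3):
    model.tx_interval_elapsed()
print("BFD up after peer down:", model.upstream_bfd_up)

model.on_log_event("VSM.RmtMLAGPeerUp")
model.tx_interval_elapsed()
print("BFD up after peer up:", model.upstream_bfd_up)
```

Blocking the control packets makes the upstream router's own detection timer expire, so it withdraws the affected path without needing any direct knowledge of the ISC state.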
The very simple configuration is shown below:
# Module acl configuration.
create access-list disbfd " protocol udp ; destination-port 3784 ;" " deny ;" application "Cli"
# Module ems configuration.
create log filter upm_ISC_LinkDown
create log filter upm_ISC_LinkUp
configure log filter upm_ISC_LinkDown add events VSM.RmtMLAGPeerDown
configure log filter upm_ISC_LinkUp add events VSM.RmtMLAGPeerUp
create log target upm disable_bfd2BB
enable log target upm disable_bfd2BB
configure log target upm disable_bfd2BB filter upm_ISC_LinkDown severity Info
create log target upm enable_bfd2BB
enable log target upm enable_bfd2BB
configure log target upm enable_bfd2BB filter upm_ISC_LinkUp severity Info
# Module upm configuration.
create upm profile disable_bfd2BB
conf access-list add disbfd last priority 0 zone SYSTEM vlan LAN-Interco-1 egress
create upm profile enable_bfd2BB
conf access-list del disbfd all

Please reply with questions and comments. I would like to refine this into a knowledge article once it has been clarified to the point where it is simple to understand.
- Mike Lane, SE in Bavaria