VRRP Stuck in INIT State

Erik_Auerswald
Contributor II
Hi,

when trying to add another instance of VRRP to a switch (stack of two X670-G2, EXOS 16.1.3.6-patch1-8), VRRP for that VLAN stays in state INIT.

After enabling VRRP for the VLAN, a message is printed that VRRP will only become active with active ports in the VLAN:
VRRP Vrid 1 will be enabled when VLAN VL-DATA-172-18-2 obtains an active port

But the VLAN is active, I see MAC addresses in this VLAN (show fdb vlan VL-DATA-172-18-2), and I can ping the VLAN's IP address. The interface is advertised in OSPF (I see it in the LSDB of another router in the area).

Other VRRP instances in different VLANs on the switch are active. There are just two of them, so it is not a question of the number of VRRP instances.

Does anybody have an idea how to debug this? Or perhaps even a possible solution?

Thanks,
Erik
9 REPLIES

Erik_Auerswald
Contributor II
Hi,

I'm sorry, but we had to reboot the switch to make progress this evening (in Europe). 😞

Anyway, routing was fine, I could ping all switch IP addresses from my notebook, and from the neighbor switch with the active VRRP address.

The VLAN I wanted to start VRRP on was shown as directly connected, route up (U) and installed in forwarding table (f). I did see ARP entries in this VLAN. I could ping IP addresses in this VLAN from the switch.

VRRP runs across an EAPS ring. The switch is a transit node for the EAPS ring in question. The EAPS ring was complete. There are additional EAPS domains on that switch.

The VRRP MAC was shown in the FDB on the port where it was expected. I could ping the virtual and physical addresses of the neighboring VRRP peer, and that switch could ping the physical address of the problem stack in the problematic VLAN. The neighbor switches showed the correct MAC address on the port as well.

The VLAN showed two active ports, the two ring ports.

The reboot proved problematic, too. The switch lost its RADIUS configuration, and it blocked the secondary port on a complete EAPS ring it is a transit node for. Disabling a different EAPS domain (the switch is a transit node there as well; it shares the same ports but no VLANs, and that ring is incomplete) unblocked the secondary port in the complete EAPS ring.

The situation is stable, I will probably open a case (or two) tomorrow. I have saved a show tech-support from before the reboot.

If you have any ideas or speculations what could have caused the problem (any of those described above), feel free to post them.

Thanks again,
Erik

Hi Simon,

thanks for your input. At another customer, VRRP passwords were introduced to prevent servers using VRRP from interfering with the routers.

Different VRIDs per VLAN are often not possible with EXOS, because of the limited number available there. 😞

Thanks,
Erik

This is probably not the issue, but be aware of it: where possible, I use a VRID other than 1. Why? In the past I have found that devices you might not expect use VRRP or VRRP-like protocols. One example I came across years ago was two Linux servers running VRRP (they did not call it VRRP, and the server admins were unaware that it was VRRP). It played havoc because it used the same MAC address as the VRRP gateway.
Cisco even recommends in some literature that you use a different VRRP VRID for each VLAN. Although not strictly necessary, this has the advantage that the MAC address of the default gateway is unique for every VLAN. So if some muppet loops your VLANs together at the edge, the outcome can be more stable until you work out what has happened. Also, it is likely you will have a firewall somewhere that also runs VRRP on one of your VLANs.
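
To illustrate why the VRID determines the gateway MAC: the VRRP RFCs (3768 and 5798) define the IPv4 virtual router MAC as 00-00-5E-00-01-{VRID}. The small Python sketch below just computes that address; the function name and the VLAN-to-VRID mapping are made up for illustration and are not taken from any switch configuration in this thread.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a VRID (00-00-5E-00-01-{VRID})."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00:00:5e:00:01:{vrid:02x}"


# Hypothetical VLAN-to-VRID assignments, for illustration only.
same_vrid = {"VLAN-A": 1, "VLAN-B": 1}
unique_vrid = {"VLAN-A": 10, "VLAN-B": 20}

for name, plan in (("same VRID", same_vrid), ("unique VRIDs", unique_vrid)):
    macs = {vlan: vrrp_virtual_mac(vrid) for vlan, vrid in plan.items()}
    # If two VLANs share a MAC, anything else using that VRID
    # (or a loop between the VLANs) collides with the gateway MAC.
    shared = len(set(macs.values())) < len(macs)
    print(name, macs, "shared gateway MAC" if shared else "distinct gateway MACs")
```

With VRID 1 in every VLAN, each gateway uses 00:00:5e:00:01:01, which is the same address any other VRRP speaker with VRID 1 would use.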

I forgot to mention: Without changing anything of the VRRP configuration, the VRRP problem went away after the reboot (after resolving the EAPS issue).

It seems as if the VLAN up indicator used for VRRP was not in sync with the rest of the switch. VRRP should just have looked at the show VLAN output. 