VRRP master change without known reason

  • Problem
  • Updated 1 year ago
  • Solved
Hi everybody,

I have a problem with VRRP on my network.
I have two x670 switches (both running version 15.5.3.4), configured with about 4 VRRP instances.
The first one has a priority of 150 and a preempt delay of 180 seconds. The second one has a priority of 100 and no preempt.
I configured VRRP tracking based on ping and VLAN on the first x670.

For the past few weeks, I have often received alerts about VRRP master changes with no apparent reason.
The second x670 becomes master for about 186 seconds (including the preempt timer).
I added the VRRP.TrackConditionFailed event to my logs but didn't find any entry for it.
The only log entries for these events look like this:
09/15/2017 11:34:15.06 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to MASTER(2)
09/15/2017 11:34:15.05 <Noti:VRRP.StateChng> VLAN examed vrid 43: transitioning to MASTER(2)
09/15/2017 11:31:15.08 <Noti:VRRP.StrtPreemptTimer> VR on VLAN interco with VR Id 9 has started its preempt timer of 180 seconds
09/15/2017 11:31:15.02 <Noti:VRRP.StrtPreemptTimer> VR on VLAN examed with VR Id 43 has started its preempt timer of 180 seconds
09/15/2017 11:31:14.40 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to BACKUP(1)
09/15/2017 11:31:14.40 <Noti:VRRP.StateChng> VLAN examed vrid 43: transitioning to BACKUP(1)
09/15/2017 11:31:13.46 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to INIT(0)
09/15/2017 11:31:13.42 <Noti:VRRP.StateChng> VLAN examed vrid 43: transitioning to INIT(0)
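To put a number on how long the first switch stays out of the MASTER role, the state-change lines above can be parsed and the timestamps subtracted. This is only a minimal sketch in Python: the function names `state_changes` and `outage_seconds` are made up for illustration, and it assumes the log format is exactly as shown in this post (MM/DD/YYYY dates, hundredths of a second).

```python
import re
from datetime import datetime

# Three of the lines quoted above, for the "interco" VLAN only.
LOG = """\
09/15/2017 11:34:15.06 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to MASTER(2)
09/15/2017 11:31:14.40 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to BACKUP(1)
09/15/2017 11:31:13.46 <Noti:VRRP.StateChng> VLAN interco vrid 9: transitioning to INIT(0)
"""

# Assumed line layout, taken from the excerpt in this post.
PATTERN = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2}) <Noti:VRRP\.StateChng> "
    r"VLAN (\S+) vrid (\d+): transitioning to (\w+)\(\d\)"
)


def state_changes(text):
    """Return (timestamp, vlan, vrid, state) tuples in chronological order."""
    events = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S.%f")
            events.append((ts, m.group(2), int(m.group(3)), m.group(4)))
    return sorted(events)


def outage_seconds(events, vlan):
    """Seconds between the first INIT and the next MASTER transition on a VLAN."""
    init = next(ts for ts, v, _, s in events if v == vlan and s == "INIT")
    master = next(
        ts for ts, v, _, s in events if v == vlan and s == "MASTER" and ts > init
    )
    return (master - init).total_seconds()


print(outage_seconds(state_changes(LOG), "interco"))
```

On the three lines embedded here this gives roughly 181.6 seconds between INIT and the return to MASTER, which is the 180-second preempt timer plus the short INIT/BACKUP phase.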
NetSight sends me email alerts containing this message:
Message: VRRP master change: Master IP A.B.C.D Reason <unknown>

Does anybody have an idea of what I can do to get more information about these events?

Best regards,
Romain M.

Romain Mercier


Posted 1 year ago


simon bingham


I have had to do the same in the past. I hope these notes are helpful to you.

Don't forget to turn off log debug mode when you are done. The switch warns you about CPU usage, but I have not had any issues turning this on.


Send detailed VRRP debug data and normal INFO-level information for everything to a syslog server.

# Create the filter: "VRRPEVENTS" captures VRRP events in detail and ALSO the usual log messages at info severity and above

create log filter VRRPEVENTS
configure log filter VRRPEVENTS add events All severity info
configure log filter VRRPEVENTS add events VRRP severity debug-verbose

# Setup SYSLOG

configure syslog add 172.27.233.199:514 vr VR-Mgmt local0
configure log target syslog 172.27.233.199:514 vr VR-Mgmt local0 filter VRRPEVENTS severity Debug-Verbose
enable log target syslog 172.27.233.199:514 vr VR-Mgmt local0

# Run debug logging
enable log debug-mode


Kawawa, GTAC

A VRRP failover will only happen if the backup does not hear from the master within the configured timers + preempt value. I would configure counters on the egress port of the master and on the ingress port of the backup, and check them once you get an alert. Or, narrowing in on what Simon stated above, create a log filter and add only the VRRP EMS conditions:

configure log filter "DefaultFilter" add events VRRP severity debug-verbose
enable log debug-mode
The conditions of interest are VRRP.Advert.Rcv, VRRP.Advert.Accept and VRRP.Advert.Trace. StateChng is more of a result than a trigger, so it is not very useful in this regard.

You can then look at the timestamps of the Hellos right before the failure and see 1) whether the sent/received counts add up and 2) what the Trace tells you about the events right before the failover.
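As a rough aid for reading those timestamps: under VRRPv2 (RFC 3768) a backup declares the master down after Master_Down_Interval = 3 x Advertisement_Interval + Skew_Time, with Skew_Time = (256 - Priority) / 256 seconds. The Python sketch below computes that limit and flags any gap between received adverts that exceeds it. It assumes a 1-second advertisement interval (the thread does not state the configured value), and `find_gaps` is a hypothetical helper that takes arrival times already extracted from the log as plain seconds.

```python
def master_down_interval(priority, advert_interval=1.0):
    """VRRPv2 Master_Down_Interval in seconds (RFC 3768).

    advert_interval defaults to 1.0 s, which is an assumption here;
    use the value actually configured on the switches.
    """
    skew_time = (256 - priority) / 256
    return 3 * advert_interval + skew_time


def find_gaps(arrival_times, limit):
    """Return (previous, next) pairs of advert arrival times further apart than limit."""
    ordered = sorted(arrival_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Backup priority of 100, as configured on the second x670.
limit = master_down_interval(100)
print(limit)

# Hypothetical arrival times in seconds, for illustration only.
print(find_gaps([0.0, 1.0, 2.0, 6.5, 7.5], limit))
```

If no gap on the backup ever exceeds this limit around the time of an alert, the failover was probably not caused by lost advertisements, which points back at the master itself.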

Romain Mercier

Thank you for your help, and sorry for the late answer.

I'm trying both the counters and the trace.
For the counters, I placed an ACL on the egress of each VLAN that has a VRRP instance. I assume there is no other VRRP instance in each VLAN. I placed it on VLAN egress because I already have an ingress ACL on the port interconnecting the devices. I did the same on the other device, but on ingress.

Here is the content of the ACL:
entry VRRP-advert {
if match all {
    protocol 112 ;
}
then {
    permit  ;
    count VRRP-advert ;
}
}


The values differ between the two devices: the counters on the "backup" device are about half of those on the "master" device.

In the traces, at the beginning of the transition, I found this in the logs of the master (most recent line first):
VRRP:  VR on VLAN examed with VR Id 43 has started its preempt timer of 180 seconds
VRRP:  Registered VRRP on vlan "interco"
VRRP:  Enabling VR, initialized as BACKUP due to priority of less than 255
VRRP.Advert:  Advertisement received on vlan Uangers-RIE vrid 120 from 100.66.129.172
VRRP:  membership to mcast added, if=0x28
VRRP:  vrrp_track_adjust_vr_state VR Uangers-RIE
VRRP.Advert:  advert internally queued for transmission
VRRP:  vrrp_vlan_enable Setting/Unset vlan flags: 0
VRRP.Advert:  advert handed to kernel for transmission
VRRP:  VLAN examed vrid 43: transitioning to INIT(0)
VRRP.Advert:  advert handed to kernel for transmission

I have no idea why the VRRP instance transitioned to INIT. It's likely that the "master" device forces the "backup" device to become MASTER.

Any ideas?

Best regards,
Romain M.

Romain Mercier

Due to some doubts about VRRP track-ping raised by the following log lines, I've removed the configured track-ping. No VRRP transition has occurred in the last hour.

VRRP:  Entering function: vrrp_dm_checkpoint_ping_track primaryMSM:1 dmIsCheckpointingON:0 config->vrrpCheckpointingEnabled:1
VRRP.Advert:  advert handed to kernel for transmission
VRRP.Advert:  advert internally queued for transmission
VRRP.Advert:  VR on vlan v6interco with vrid 60: copied 1 VIPs
VRRP.Advert:  VR on vlan v6interco with vrid 60: v3 advert length = 80
VRRP.Advert:  advert handed to kernel for transmission
VRRP.Advert:  advert handed to kernel for transmission
VRRP.Advert:  advert internally queued for transmission
VRRP.Advert:  VR on vlan Uangers-RIE with vrid 120: copied 1 VIPs
VRRP.Advert:  VR on vlan Uangers-RIE with vrid 120: advert length = 40
VRRP.Advert:  advert internally queued for transmission
VRRP.Advert:  VR on vlan Uangers-RIE with vrid 120: copied 1 VIPs
VRRP.Advert:  VR on vlan Uangers-RIE with vrid 120: v3 advert length = 32
VRRP:  membership to mcast deleted
VRRP:  VRRP has requested the virtual MAC 00:00:5E:00:01:09 be deleted
VRRP.Advert:  advert handed to kernel for transmission
VRRP.Advert:  advert handed to kernel for transmission
VRRP.Advert:  advert internally queued for transmission
VRRP.Advert:  VR on vlan interco with vrid 9: copied 1 VIPs
VRRP.Advert:  VR on vlan interco with vrid 9: advert length = 40
VRRP.Advert:  advert internally queued for transmission
VRRP.Advert:  VR on vlan interco with vrid 9: copied 1 VIPs
VRRP.Advert:  VR on vlan interco with vrid 9: v3 advert length = 32
VRRP:  VLAN interco vrid 9: transitioning to INIT(0)
VRRP:  VLAN interco vrid 9: transitioning to INIT(0)
VRRP:  vrrp_vlan_enable Setting/Unset vlan flags: 0
VRRP:  De-registered VRRP on vlan instance 1000009
VRRP:  vrrp_track_adjust_vr_state VR interco...calling vrrp_vr_disable()
VRRP:  vrrp_track_adjust_vr_state VR interco
VRRP:  Entering function: vrrp_dm_checkpoint_ping_track primaryMSM:1 dmIsCheckpointingON:0 config->vrrpCheckpointingEnabled:1
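Because the trace is listed newest first, the cause sits below the effect, which makes it easy to misread. A small Python sketch that flips the lines into chronological order and shows what led up to the first INIT transition may help. It assumes this excerpt is ordered most recent line first like the earlier one, and `context_before` is a hypothetical helper, not an EXOS or NetSight feature.

```python
def context_before(lines_newest_first, marker, count=4):
    """Return up to `count` lines preceding the first line containing `marker`,
    plus that line itself, in chronological order."""
    chronological = list(reversed(lines_newest_first))
    for i, line in enumerate(chronological):
        if marker in line:
            return chronological[max(0, i - count):i + 1]
    return []


# The last six lines of the trace above, in the order they were posted.
TRACE = [
    "VRRP:  VLAN interco vrid 9: transitioning to INIT(0)",
    "VRRP:  vrrp_vlan_enable Setting/Unset vlan flags: 0",
    "VRRP:  De-registered VRRP on vlan instance 1000009",
    "VRRP:  vrrp_track_adjust_vr_state VR interco...calling vrrp_vr_disable()",
    "VRRP:  vrrp_track_adjust_vr_state VR interco",
    "VRRP:  Entering function: vrrp_dm_checkpoint_ping_track primaryMSM:1 "
    "dmIsCheckpointingON:0 config->vrrpCheckpointingEnabled:1",
]

for line in context_before(TRACE, "transitioning to INIT", count=5):
    print(line)
```

Read in this order, the sequence is: the ping-track checkpoint function runs, `vrrp_track_adjust_vr_state` calls `vrrp_vr_disable()`, VRRP is de-registered on the VLAN, and only then does the VR go to INIT.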

Romain Mercier

No more VRRP transitions have occurred since I removed the track-ping.

My track-ping configuration used 3 missed pings with a frequency of 1 s. This configuration was put in place about 5 years ago.
Does anybody know how long the switch waits for a reply before it considers a ping lost?

Kind regards,
Romain M.
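For reference, the miss-count logic described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: it assumes the 3 misses must be consecutive, and it takes each probe result as an already-decided True/False, since the per-probe reply timeout is exactly the unknown being asked about. `first_track_failure` is a hypothetical name.

```python
def first_track_failure(replies, misses=3):
    """Return the index of the probe at which `misses` consecutive
    failures are reached, or None if that never happens.

    replies: iterable of booleans, True when a ping reply was received.
    """
    streak = 0
    for i, ok in enumerate(replies):
        streak = 0 if ok else streak + 1
        if streak >= misses:
            return i
    return None


# Hypothetical probe results, one per second.
print(first_track_failure([True, False, False, True, False, False, False]))
```

With one probe per second, this model means the tracked target only has to be unreachable, or slower than the timeout, for about 3 seconds before the VR is disabled.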