Running different versions in a VRRP/MLAG cluster?

  • Question
  • Updated 2 years ago
  • Answered
We're running 15.6.2.12-nopatch and have been getting some odd random reboots and OSPF issues, so we're looking to move to a new version. I guess the latest recommended is 15.6.4.2-patchLatest.

We run sets of X670-G2s and X670s in MLAG clusters, with VRRP between pairs of them, plus OSPF.

My question is: will I run into any issues if I do the upgrade above on one unit, wait a few days to make sure all is stable, then do the other one? Will this give me a hitless upgrade?

Chris


Posted 2 years ago


Patrick Voss, Alum

Hello Chris,

If you are referring to a stack, you will run into issues if you attempt this. A hitless upgrade is never recommended. If they are MLAG peers then you can do this, but I am not sure that is the correct approach, considering that two switches that are virtually one L2 device may behave differently. Would you elaborate on the issues you have been facing and provide a network topology (including the models)?

Chris

No, no stack, just side-by-side MLAG'd peers that run VRRP+OSPF for our network. They're connected to a separate set of MLAG'd peers that we use for aggregation.

The issue: we've noticed very low uptime on our switches, and from the look of it they randomly restarted. One restarted Friday, though I'm not sure why; there are no errors in the logs or anything else that I can find.

Will take that part up with TAC once we get our contract fixed.  

The other issue is something we experienced before: OSPF seems to get eaten from some sites, with one side just getting stuck. We were told it was a bug from an old version, but it still seems to be happening to us on the version we're running. We're hoping it's fixed between .2 and .4.

Patrick Voss, Alum

Would you explain "eaten from some sites"?

Chris

Hehe: https://gtacknowledge.extremenetworks.com/articles/Solution/OSPF-state-is-stuck-in-Init

Or something similar to that. Disabling IGMP snooping didn't work around the issue when we tried it, so it might be a different issue, but the symptoms are basically the same: one side shows as stuck in INIT, and the others show no neighbors.

Henrique, Employee

Hi Chris, regarding MLAG/VRRP: you can upgrade one stack (MLAG peer) and then, after communication is reestablished, upgrade the other MLAG peer. During this EXOS version transition you should not experience any network issues (except on the stack that has to reboot).

It's not recommended to run different versions between MLAG peers.
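
The one-peer-at-a-time order described above can be sketched roughly as follows. This is only an illustration, not an EXOS tool: `upgrade` and `is_healthy` are hypothetical callbacks that you would supply yourself, standing in for whatever installs the image and whatever confirms that MLAG, VRRP and OSPF are back up on that peer.

```python
from typing import Callable, List


def rolling_upgrade(
    peers: List[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_checks: int = 10,
) -> List[str]:
    """Upgrade peers one at a time, stopping if a peer does not recover.

    `upgrade` and `is_healthy` are hypothetical, caller-supplied callbacks.
    Returns the peers that were upgraded and then reported healthy.
    """
    done: List[str] = []
    for peer in peers:
        upgrade(peer)
        # Poll until the peer reports healthy, up to max_checks attempts.
        recovered = any(is_healthy(peer) for _ in range(max_checks))
        if not recovered:
            # Stop here so the remaining peer stays on the old version
            # and keeps forwarding traffic.
            break
        done.append(peer)
    return done
```

The point of the early `break` is that the second peer is never touched unless the first one has come back cleanly, which is what keeps one side of the pair forwarding at all times.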

Chris

Ya, I figured it wouldn't be recommended, but I just wanted to confirm that it won't cause a crippling issue for the short while we're doing it, and that it would be hitless.

Henrique, Employee

Hi Chris,

Since you will be running the same EXOS version (15.6.2.12) between peers, I don't believe there will be any issue. However, I cannot guarantee it 100%. A good place to look would be the release notes, to check whether anything important has changed between the unpatched version and the latest patch.

Chris

I was actually going to move to 15.6.4 latest instead of 15.6.2. I figured higher release = more bugfixes/security fixes. There are a lot of release notes between 15.6.2.12 and 15.6.4.2-latest :(

Drew C., Community Manager

Chris, the 15.6.5.2 Release Notes show that it contains fixes all the way up through 15.6.4.2-patch1-7 (and has more of its own). It also includes all of the open and resolved issues for previous releases. That will make it easier to review. If you don't want to go all the way to 15.6.5.2, the release notes for 15.6.4.2-patch1-7 work the same way. You can just Ctrl+F your way through the document to find MLAG mentions.
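
If the release notes are long, the same Ctrl+F idea can be scripted. The following is a minimal sketch, assuming you have already saved the release notes as plain text yourself; the function name `find_mentions` is made up for illustration and is not part of any Extreme tooling.

```python
from typing import List, Tuple


def find_mentions(notes_text: str, keyword: str = "MLAG") -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (number, line.strip())
        for number, line in enumerate(notes_text.splitlines(), start=1)
        if needle in line.lower()
    ]
```

Calling it again with `keyword="OSPF"` or `keyword="VRRP"` on the same text gives the other lists worth reading before the upgrade.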

Chris

The patch architecture between branches seems so crazy. You would think that any patch with a publish date later than the original patch that had a fix would include that fix. I think one of our problems with the 15.5 releases is that we're hitting an XOS bug that causes this error:

<Erro:Kern.IPv4Mc.Error> Unable to Add IPmc sender entry s,G,v=x.x.x.x,224.0.0.5,2022 IPMC 8 flags 4 unit 0, Entry exists

I believe that might be what's affecting OSPF. And then we have random reboots on both of our XOS boxes that are OSPF/VRRP routers.

The recommended software list is sort of funny, considering it doesn't recommend the latest branches, and on top of that it doesn't even recommend the latest patch version of those releases.
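
One reason to suspect a link between that log message and OSPF is that 224.0.0.5 is the OSPF AllSPFRouters multicast group (224.0.0.6 is AllDRouters). Here is a rough sketch for flagging such lines in a saved log; the helper name and the approach of matching any IPv4 address in the line are assumptions for illustration, not a description of the EXOS log format.

```python
import re

# Well-known OSPF multicast groups.
OSPF_GROUPS = {"224.0.0.5": "AllSPFRouters", "224.0.0.6": "AllDRouters"}

# Matches dotted-quad addresses; placeholders like x.x.x.x are ignored.
IPV4_PATTERN = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def ospf_groups_in(log_line: str) -> list:
    """Return the names of OSPF multicast groups whose address appears in the line."""
    return [
        OSPF_GROUPS[addr]
        for addr in IPV4_PATTERN.findall(log_line)
        if addr in OSPF_GROUPS
    ]
```

A non-empty result only says the line mentions an OSPF group address. It does not prove that the error is what breaks the adjacency.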

Peter

In XOS 15.5 I had a problem in the past where MAC learning did not work correctly when MLAG peers ran different 15.5 versions.

I personally would recommend upgrading the G1 to the latest 15.7 or 16 and the G2 to XOS 21.

I don't like XOS version 15.6 and prior: too much trouble. I know it's recommended for a lot of systems, but I'm running at least XOS 15.7 at all my customers and have no problems.

Last year in July I set up a new core at a customer with two X670G2s as MLAG peers with VRRP on XOS 15.7. Until March this year there were no problems. In March I upgraded to XOS 21 with nearly no traffic interruption.

In the near future I will configure VRRP fabric routing (active/active).

Chris

Peter, thanks for your input. Ya, I'm wondering if I shouldn't just jump to XOS 21. I was worried that since it was such a big version jump it would be buggy, but honestly 15.5 seems to be very buggy even though it was the recommended release at the time.