01-25-2024 01:18 PM - edited 01-25-2024 01:23 PM
We're going to install two 48-port SFP+ switches at a co-location facility that will house all of our primary servers and storage.
Both X690 switches have the Core license and will do layer 3. Our provider will extend our 10 gig ring into our rack, and normally they hand off one 10 gig fiber to us. The conundrum is that single handoff: if you plug it into switch 1 and reboot that switch, the site's down. We'll also peer with a Palo Alto firewall cross-connected to both switches, and it will have a higher-cost OSPF path back to another Palo Alto at our DR facility, which could hop onto the 10 gig ring from there. But that path is an IPsec tunnel over the internet at only about 500 Mbps, a fraction of the 10 gig our metro Ethernet provider is giving us.
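On the EXOS side, that preference is just OSPF interface cost on the WAN-facing VLAN (the higher-cost DR path would be set on the Palo Alto side). A minimal sketch, where the VLAN name "v-wan", the router ID, and the cost values are all placeholders:
configure ospf routerid 10.0.0.1
configure ospf add vlan "v-wan" area 0.0.0.0
# Low cost on the 10 gig metro path so it wins over the ~500 Mbps DR tunnel
configure ospf vlan "v-wan" cost 10
enable ospf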
So our metro-E provider says they can hand us off two links, one to each core switch. I know I said ring above, but the ring is on their side with some kind of G.xxxx protection; it's completely transparent to us (the WAN side). The LAN side is where they can give us one or two links. That way we can reboot one core switch and stay online. The servers and iSCSI storage would be cross-connected to both switches with MLAGs. But this raises a question: how does OSPF peer, and if we briefly span some layer 2 so we can storage- and host-vMotion from one datacenter to the other, wouldn't that create a loop?
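For the cross-connect piece, the EXOS MLAG config is roughly the following shape; a minimal sketch for one core switch, with the VLAN name, peer name, addresses, and port numbers all placeholders (the other core mirrors this with the addresses swapped):
# Peering VLAN over the direct link between the two cores (the ISC)
create vlan "isc"
configure vlan "isc" add ports 48 tagged
configure vlan "isc" ipaddress 10.255.255.1/30
# Define the other core as the MLAG peer
create mlag peer "core2"
configure mlag peer "core2" ipaddress 10.255.255.2 vr VR-Default
# Dual-home a server/iSCSI port; use the same MLAG id on both cores,
# and the server side runs a normal LACP bond across the two links
enable mlag port 10 peer "core2" id 10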
The other thing we're considering, instead of two separate core switches, is combining them into one stack. But the question there is how do we do code upgrades? We would still have everything cross-connected, but instead of an MLAG it would just be enable sharing (LACP trunks) across ports 1:1 and 2:1, 1:2 and 2:2, etc.
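In the stack design those cross-connects would look something like this; a sketch assuming a two-node SummitStack, with port numbers as placeholders:
# One LAG spanning both stack members, so either node can reboot
# without dropping the trunk
enable sharing 1:1 grouping 1:1,2:1 algorithm address-based L3 lacp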
Then you have one routing brain for OSPF. But if we want to reboot one switch at a time, would you lose management? Can you even upload a new EXOS version and just reboot one slot, make sure it comes back up, and then reboot the other slot? That's the functionality we've come to think managing both switches separately would give us.
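For what it's worth, EXOS does let you restart an individual stack node, though the stack runs a single image, so this is a node reboot rather than a true rolling upgrade; a quick sketch with slot numbers as placeholders:
show slot            # confirm both nodes are operational
reboot slot 2        # restart only the backup node
show stacking        # verify it rejoined before touching slot 1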
So, to stack or to keep them separate? What's best for a mixed layer 2 / layer 3 environment?
01-28-2024 11:04 AM
I would run them separately and use MLAG for dual homing between any local devices you want to plug in.
If your provider is able to offer layer 2 between the two links, you could either run LACP if available, or perhaps EAPS for automated failover (I have done this over very long distances; it worked flawlessly).
This will allow you to reboot one switch at a time for maintenance.
You can run separate OSPF routing engines on the two switches (it's been many years since I have done this, having moved everything to Fabric).
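If you go the EAPS route over the two provider links, the master-node side is roughly this shape; a minimal sketch, with ring ports, VLAN names, and tags all placeholders (the other node is the same but with mode transit):
create eaps "metro-ring"
configure eaps "metro-ring" mode master
configure eaps "metro-ring" primary port 49
configure eaps "metro-ring" secondary port 50
# Dedicated control VLAN, tagged on the ring ports
create vlan "eaps-ctrl"
configure vlan "eaps-ctrl" tag 4000
configure vlan "eaps-ctrl" add ports 49,50 tagged
configure eaps "metro-ring" add control vlan "eaps-ctrl"
# Data VLANs carried on the ring get added as protected
configure eaps "metro-ring" add protected vlan "v-data"
enable eaps "metro-ring"
enable eaps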
01-31-2024 03:12 AM
@Brent_Addis wrote:
- ICL (Inter-Chassis Link): In Extreme Networks terminology, the direct connection between MLAG peers is called an ICL. It's used to forward traffic that is received on one MLAG peer but needs to be transmitted out of a port on the other MLAG peer.
It's called ISC, Inter-Switch Connection. 🙂
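Whichever name sticks, the peer relationship and ISC health are easy to keep an eye on with the standard MLAG show commands:
show mlag peer     # peer state, ISC status, checkpointing
show mlag ports    # per-MLAG-id local/remote port state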
01-31-2024 11:22 AM
LOL you're right! I keep on getting up early and writing these without having my morning coffee!
01-29-2024 02:10 PM
We're also testing the MLAG'd separate-switch scenario, but if I'm logged into switch 2 and we reboot switch 1, I lose my connection to switch 2. That shouldn't be the case.
Both are MLAG'd to another switch stack using LACP trunking, so that switch stack appears as one link.
Both switch 1 and switch 2 are OSPF peering, but yeah, I lose connection to both of them no matter which one I reboot, and it's not the OSPF dead timer... we lose it until the other switch comes back.
Any idea what's happening here? It doesn't seem any better than the 20-second loss in connectivity when rebooting one slot in a stack.
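Losing both peers at once points at something shared: the ISC LAG, the LACP conversation toward the downstream stack, or the path the management session takes. Worth running these from the surviving switch's console during the outage to narrow it down:
show ospf neighbor   # did the surviving switch's adjacency stay up?
show lacp            # are the LAG members toward the downstream stack still selected?
show mlag peer       # peer should show down, but local MLAG ports should stay active
show iparp           # is the next hop for the management subnet still resolving?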