Extreme Networks

Keith9 · 01-25-2024

We're going to install two 48-port SFP+ switches at a co-location facility that will house all of our primary servers and storage.

Both X690 switches have the Core license and will do Layer 3. Our provider will add our 10 gig ring to our rack, and normally they hand off one 10 gig fiber to us. The conundrum is that with a single handoff, if you plug it into switch 1 and reboot that switch, the site's down. We'll also peer with a Palo Alto firewall cross-connected to both switches, and it will have a higher-cost OSPF path back to another Palo Alto at our DR facility, where traffic could hop onto the 10 gig ring. But that path is an IPsec tunnel over the internet and only about 500 Mbps, a fraction of the 10 gig our metro Ethernet provider is giving us.
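
To make the cost reasoning concrete, here is a rough Python sketch of shortest-path selection of the kind OSPF performs: the 10 gig ring wins while it is available, and the IPsec tunnel is only used when the ring handoff is gone. The node names and cost values are invented for illustration and are not taken from any real config.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None  # destination unreachable

# Hypothetical topology and costs (assumptions, not real values).
LINKS = {
    ("core1", "ring"): 10,          # 10 gig metro-E handoff
    ("ring", "hq"): 10,
    ("core1", "palo_dc"): 10,       # firewall at the new DC
    ("palo_dc", "palo_dr"): 1000,   # ~500 Mbps IPsec tunnel, deliberately expensive
    ("palo_dr", "ring"): 10,        # DR site hops back onto the ring
}

print(shortest_path(LINKS, "core1", "hq"))

# Simulate losing the ring handoff at the DC.
backup = {k: v for k, v in LINKS.items() if k != ("core1", "ring")}
print(shortest_path(backup, "core1", "hq"))
```

With these made-up numbers the tunnel is only chosen once the direct handoff disappears, which is the behaviour the higher OSPF cost is meant to produce.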

So our metro-E provider says they can hand us two links, so we can have one to each core switch. I know I said ring up above, but the ring is on their side (the WAN side) with some kind of G.xxxx protection. It's completely transparent to us. The LAN side is where they can give us 1 or 2 links. That way we can reboot one core switch and stay online. The servers and iSCSI storage would be cross-connected to both switches with MLAGs. But this raises a question... how does OSPF peer, and if we briefly do some Layer 2 spans so we can do storage and host vMotion from one datacenter to another, wouldn't that create a loop?

The other thing we are considering is, instead of two separate core switches, combining them both into one stack. But the question here is how do we do code upgrades? We would still have everything cross-connected, but instead of an MLAG it would just be enable sharing (LACP trunks) from ports 1:1 and 2:1, 1:2 and 2:2, etc...

Then you have one routing brain for OSPF. But if we want to reboot one switch at a time, would you lose management? Can you even upload new EXOS versions and just reboot one slot, ensure it comes back up, and then reboot the other slot? This is the functionality we've come to think managing both switches separately would provide us.

So, to stack or to keep them separate? What's best for a mixed Layer 2 / Layer 3 environment?

Brent_Addis · 01-28-2024

I would run them separately and use MLAG for dual homing between any local devices you want to plug in.

If your provider is able to offer Layer 2 between the two links, you could either run LACP if available, or perhaps EAPS for automated failover (I have done this over very long distances, and it worked flawlessly).

This will allow you to reboot one switch at a time for maintenance.

You can run separate OSPF routing engines between the two (it's been many years since I have done this, having moved everything to Fabric).

1. MLAG Basics:

MLAG Peer Link: The switches in an MLAG pair are interconnected by a peer link. This link is used to synchronize state between the switches and forward traffic from one switch to the other when needed.
ISC (Inter-Switch Connection): In Extreme Networks terminology, the direct connection between MLAG peers is called an ISC. It's used to forward traffic that is received on one MLAG peer but needs to be transmitted out of a port on the other MLAG peer.

2. OSPF in an MLAG Environment:

Separate Control Planes: Even though there are two physical switches, MLAG makes them appear as a single logical switch to a dual-homed device from a Layer 2 perspective. However, each switch keeps its own control plane, so for Layer 3 protocols like OSPF each switch still operates independently.
Independent OSPF Instances: Each switch runs its own instance of OSPF. They don't share OSPF information directly with each other; instead, they interact with other routers in the network individually.
Redundancy: Each switch establishes its own neighbor relationships and participates in OSPF as an individual router. This provides redundancy because if one switch fails, the other can continue to forward traffic.
MLAG Ports and OSPF: If an OSPF neighbor is connected to an MLAG port (a port that is part of a multi-chassis link aggregation group), the neighbor's OSPF process sees a single logical interface (the LAG) due to the MLAG configuration, even though the neighbor physically connects to two different switches.

3. Configuration Considerations:

Consistent Configuration: Ensure that OSPF is configured consistently on both MLAG peers. Any mismatch in OSPF configurations can lead to suboptimal routing or routing loops.
IP Addressing: Each switch should have its own unique router ID and interface IP addresses. However, they can share a virtual IP address (for example, via VRRP) for the MLAG VLAN interfaces.
OSPF over ISC: It's typically not recommended to run OSPF directly over the ISC (the MLAG peer link). OSPF adjacencies should be formed over front-panel ports, not over the ISC, to ensure optimal traffic flow and to avoid any unnecessary traffic on the ISC.

4. Behavior During Failures:

Link Failures: If a link fails, OSPF on the affected switch will detect the failure and reroute traffic according to the OSPF topology. MLAG ensures that the failure is isolated and doesn't affect the other links or the peer switch.
Switch Failures: If one of the MLAG switches fails, the other switch will continue to operate, handling both the traffic that was destined for the failed switch and the traffic for its own ports. OSPF will reconverge, and routes will be updated across the network.

5. Monitoring and Troubleshooting:

OSPF Neighborships: Check OSPF neighbor states to ensure that adjacencies are forming correctly.
Routing Tables: Verify that routing tables are correct and that routes are properly learned and advertised by OSPF.
ISC Health: Monitor the health and status of the ISC (the MLAG peer link), as it's crucial for the proper operation of the MLAG pair.

-----
-Brent Addis / Extreme Black Belt #491

New to Extreme? Check out the Welcome series here - https://training.extremenetworks.com/welcome-series-1
Want to join the official Extreme learners discord? Let me know!

Keith9 · 01-30-2024

Ah, you know why we thought the switch was going down when we rebooted one? Because the ONLY ports in VLAN Default (tag 110) were the MLAG ISC (sharing) ports 65,69. So when the other switch went down, there were 0 active ports in this VLAN, so OSPF didn't advertise it. I realized it when I could SSH into the surviving switch from an adjacent switch on its transport network.

enable loopback-mode vlan Default on both switches resolved this.

This is a lab setup so there's not much plugged in. We (incorrectly) assumed that having IP addresses, forwarding and VRRP on both switches for VLAN "Default" (tag 110) was enough... but it wasn't. Normally this wouldn't happen because in an operating environment there would be tons of devices plugged into these face ports, causing many ports to be "active".
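
A simplified Python model of the behaviour described above, as a sketch rather than a description of EXOS internals: the VLAN's routed interface counts as up only if at least one member port is active, unless loopback mode is enabled. The class and field names are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Vlan:
    name: str
    # Maps port name -> True if the port currently has link (hypothetical model).
    port_active: dict = field(default_factory=dict)
    loopback_mode: bool = False

    def interface_up(self) -> bool:
        """The routed interface is up if loopback mode is on or any port is active."""
        return self.loopback_mode or any(self.port_active.values())

# Lab case: the only members of VLAN Default are the ISC ports 65 and 69.
vlan = Vlan("Default", {"65": True, "69": True})
print(vlan.interface_up())   # both ISC ports have link

# The peer switch reboots, so both ISC ports lose link.
vlan.port_active = {"65": False, "69": False}
print(vlan.interface_up())   # no active ports, so the interface drops

# Equivalent of enabling loopback mode on the VLAN.
vlan.loopback_mode = True
print(vlan.interface_up())   # stays up regardless of port state
```

In this model, plugging any live device into a port in the VLAN has the same effect as loopback mode, which matches why the problem only showed up in the lab.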

Keith9 · 01-29-2024

We are going to go with two independent switches.

I'm just concerned about loops. Our transport VLAN is 102, a /24 accessible by all sites on the ring. OSPF peers with them to get to their individual site voice and data LANs and, in the case of HQ, the VLANs for each floor of the building. OSPF is great in that regard. Attached is what we're looking to do.

The new data center will become our HQ at some point, but we have to move servers and storage over there in time. In the meantime I'm hoping we can use VMware vMotion to move machines from servers at HQ to the DC with Layer 2 spans. The link is 10 gig and unaltered, so if we want to enable jumbo frames we can. The provider says they top out at around 9200 MTU. If we had a single link at the new DC, I don't think I'd worry about a loop. But the fact that they can provide two links means there could be one, say on VLAN 1 (production), VLAN 10 (SAN storage) and VLAN 111 (vMotion) for sure, and even VLAN 102 (OSPF transport). The new DC will advertise its own networks via OSPF over that VLAN 102 transport, and eventually servers that move there will be re-IP'd and put into those networks, which will go out the firewalls there instead of the firewalls at HQ.

I know this sounds complicated, so I drafted up a diagram. I'm just worried that tagging these VLANs (1, 10, 20, 24, 102, 111, etc.) on both X690s at the new DC facing the ISP, and also on the 200 Gb switch-to-switch link, would cause loops.
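
The loop concern can be checked on paper with a small Python sketch. It treats the provider's ring as one transparent Layer 2 segment (as described earlier in the thread) and asks whether the links carrying a given VLAN form a cycle. The node names are hypothetical labels, and the check only looks at topology; it says nothing about any loop-prevention protocol running on top.

```python
def has_loop(links):
    """Return True if the undirected links contain a cycle (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # a and b were already connected: this link closes a loop
        parent[root_a] = root_b
    return False

# Links that would carry a spanned VLAN (hypothetical labels).
single_handoff = [
    ("x690-1", "x690-2"),      # switch-to-switch link
    ("x690-1", "provider"),    # one metro-E handoff
    ("provider", "hq"),
]
dual_handoff = single_handoff + [("x690-2", "provider")]  # second handoff

print(has_loop(single_handoff))
print(has_loop(dual_handoff))
```

With one handoff the spanned VLAN is a simple chain; adding the second handoff while the VLAN is also tagged on the switch-to-switch link closes a triangle, so something has to block or remove one of those three links for that VLAN.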

Brent_Addis · 01-30-2024

Extreme have EAPS for removing looping as an issue.

You'll need an L2 VLAN around the ring which carries heartbeat packets. It'll keep one link blocked as a backup, and in the event of a failure (i.e. the heartbeat packets stop arriving) it'll bring the other link up.

It works incredibly well, I have around a dozen customers with it.
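
As a rough illustration of the mechanism described above, here is a toy Python model of a ring master: it keeps its secondary port blocked while heartbeats make it around the ring, and unblocks it after several are missed. The class name and the threshold of three missed heartbeats are assumptions for the sketch, not EAPS defaults, and the real protocol has more states and timers than this.

```python
class RingMaster:
    """Toy model of a ring master that blocks its secondary port while the ring is healthy."""

    def __init__(self, fail_after_missed=3):
        self.fail_after_missed = fail_after_missed  # hypothetical threshold
        self.missed = 0
        self.secondary_blocked = True

    def poll(self, heartbeat_returned: bool) -> bool:
        """Process one heartbeat interval; return True if the secondary port is blocked."""
        if heartbeat_returned:
            # Ring is complete: block the secondary port so there is no loop.
            self.missed = 0
            self.secondary_blocked = True
        else:
            self.missed += 1
            if self.missed >= self.fail_after_missed:
                # Ring is broken somewhere: open the backup path.
                self.secondary_blocked = False
        return self.secondary_blocked

master = RingMaster()
for seen in [True, False, False, False, True]:
    state = "blocked" if master.poll(seen) else "forwarding"
    print(f"heartbeat returned={seen}: secondary port {state}")
```

The point of the model is that only one of the two paths forwards data at any time, which is what removes the loop while still giving automatic failover.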

https://documentation.extremenetworks.com/exos_31.7/GUID-F9D618E3-26A1-4AB5-A07F-06C2609DFB96.shtml