This question comes up a lot, and I have been guilty of asking it myself. I was recently asked to define the differences between the two, and in the process wrote the following piece. In sharing it, I hope it helps answer some of those queries.
Most of this is my own interpretation, and I have drawn on information from other posts, so feel free to correct or add to any part of it.
This article doesn’t go into any detail on the configuration of MLAG or stacking, as both are already well documented. What this article intends to do is give an informed explanation of the differences between the two, the pros and cons, and the different approaches to scaling and mixing protocols, so that better design decisions can be made.
There are descriptions on the internet of other vendors’ MLAG (Multi-Chassis Link Aggregation) given in terms of features like Virtual Switching / Virtual Chassis Bonding etc. That is absolutely not the case in the Extreme Networks sense, and I will detail the differences between these further on.
MLAG is the ability of switches to appear as a single switch at layer 2, so that bundles of links in the form of LAGs can be diversely connected to each switch and appear as one. LAGs are typically created North & South i.e. between host and switch, whereas MLAG is created and expanded in an East & West direction.
MLAG itself is not standardised, so you wouldn’t necessarily be able to create an MLAG between differing vendors. Once the MLAG is created between switches, though, the North & South LAG connection is the same regardless of what’s attaching, exactly as you would traditionally have seen and implemented.
MLAG vs Stacking / Bonding / virtual switching etc. is something that comes up a lot, and to understand what the differences are it helps to understand a little what’s going on under the hood.
The term stacking is something you would expect to see at the edge, but it isn’t something you would typically associate with the core. To get around this, vendors created stacking-like technologies for the core, such as virtual chassis bonding, but this is essentially doing the same thing! The difference is that some additional measures are used to safeguard the integrity of the chassis bonding. Here are a few:
The primary issue with the stacking type approach is the virtual device partitioning, and how it would handle layer 3 problems. In essence the architecture takes X number of boxes and turns off all but one brain (control plane). You then distribute the switching subsystems, essentially by adding complexity. So now, with one control plane, it becomes possible to implement link aggregation across the multiple switches. The primary control plane receives all control packets and directly controls the switching fabric for the whole system.
In summary, it’s this single control plane that is often the point of contention when looking at stacking / bonding type approaches at the core, whereas with MLAG the control, management and data planes are all separate.
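To make the single versus separate control plane point concrete, here is a deliberately simplified Python sketch of how far a control-plane fault reaches in each design. The function name, the switch names and the "stack" / "mlag" labels are made up for illustration, and failover to a backup control plane is not modelled, so treat it as a thinking aid rather than a description of any product's behaviour.

```python
def affected_switches(design: str, members: list[str], faulty: str) -> list[str]:
    """Return the switches touched by a control-plane fault on `faulty`.

    Assumptions (illustrative only):
    - "stack": one shared control plane, and `faulty` is the primary running it,
      so every member is affected. Backup election is not modelled.
    - "mlag": every peer runs its own control plane, so only `faulty` is affected.
    """
    if faulty not in members:
        raise ValueError(f"{faulty!r} is not one of the members")
    if design == "stack":
        return list(members)
    if design == "mlag":
        return [faulty]
    raise ValueError(f"unknown design {design!r}")


if __name__ == "__main__":
    cores = ["core1", "core2"]
    print("stack:", affected_switches("stack", cores, "core1"))
    print("mlag: ", affected_switches("mlag", cores, "core1"))
```

With the same two cores, the stacked model reports both switches as affected while the MLAG model reports only the one with the fault, which is the contention described above.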
There are, though, pros and cons to using stacking and/or MLAG, and using MLAG and stacking at the same time can even be considered complementary!
4 MLAG vs Stacking Approaches
So let’s start by taking two approaches to the same problem and decide which might be better. The scenario is two geographically separate sites, with a requirement for a pair of core switches in each location for resiliency.
The first is using MLAG:
The second approach is using stacking. The design is similar and simpler, and that appeal alone might be a draw for many, but the pros and cons will assist in the overall decision.
As you can see, it looks almost the same as the MLAG approach from a topology perspective, but is simply tilted around so that the cores forming each stack span the geographic locations rather than sitting in the same place. As with the MLAG approach, you can do inline upgrades to one stack at a time without affecting service.
4.1 MLAG Approach
4.2 Stacking Approach
4.3 Which approach is best?
So in this particular scenario it would seem to make sense to use MLAG. Stacking could, though, be preferred in other cases, for the following reasons:
So the decision to use one or the other is simply a matter of weighing up the options; understanding your own network is what determines the solution.
5 Additional Information
5.1 Extending MLAG
Here is an example where you have an existing MLAG solution but you have run out of ports and need to expand. With MLAG you simply add another MLAG pair in an East & West direction. With stacking you are limited in how far you can expand due to the common control plane: this is normally 2 for bonding, or 8 for stacking, although with stacking you want to try to stick to 4-5 as best you can.
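As a rough sketch, the member-count guidance above can be written as a small Python check. The limits are simply the figures quoted in this article (2 for bonding, 8 for stacking, 4-5 preferred for stacking); the function and constant names are hypothetical, and actual limits depend on platform and software release.

```python
# Figures quoted in the article; rules of thumb, not guaranteed product limits.
MAX_MEMBERS = {"bonding": 2, "stacking": 8}
RECOMMENDED_MAX = {"bonding": 2, "stacking": 5}  # upper end of the "4-5" guidance


def check_member_count(technology: str, members: int) -> str:
    """Classify a planned member count as 'ok', 'above recommended' or 'exceeds limit'."""
    if technology not in MAX_MEMBERS:
        raise ValueError(f"unknown technology {technology!r}")
    if members < 1:
        raise ValueError("members must be at least 1")
    if members > MAX_MEMBERS[technology]:
        return "exceeds limit"
    if members > RECOMMENDED_MAX[technology]:
        return "above recommended"
    return "ok"


if __name__ == "__main__":
    for tech, count in [("stacking", 4), ("stacking", 6), ("bonding", 3)]:
        print(tech, count, "->", check_member_count(tech, count))
```

MLAG is left out of the check on purpose: as described above, it grows by adding further MLAG pairs rather than by adding members to one shared control plane.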
5.2 Best Practice OSPF / VRRP / Fabric Routing
Whenever configuring MLAG I always configure OSPF and Fabric Routing mode. OSPF is more relevant if a route might appear on one switch and not another, or if you have multiple routers in the mix as in the example above. It’s not always necessary, but I think it’s good practice and eases any future growth of the network to configure it from the start.
Directly in line with OSPF I always enable Fabric Routing mode as another best practice feature; more detail is given here: https://extremeportal.force.com/ExtrArticleDetail?an=000080659
Enabling fabric routing mode provides a kind of active-active approach when used in conjunction with VRRP, and by extension offers optimal efficiency when used with MLAG. Effectively what it’s doing is sharing the VRRP VMAC address between each of the routers so that each router can answer independently to requests sent to the default gateway. The VRRP master will still have to respond to ARP requests, so this traffic could essentially be going over the ISL, but after that the immediate upstream switch / router will be able to respond directly.
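The effect on the ISL can be sketched in a few lines of Python. The virtual MAC format (00:00:5e:00:01 followed by the VRID) is the standard one for IPv4 VRRP; everything else here, including the function names, the peer names and the idea of reducing the decision to a single boolean, is a simplification made for illustration. ARP handling by the master, mentioned above, is not modelled.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the standard IPv4 VRRP virtual MAC for a given VRID (1-255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return f"00:00:5e:00:01:{vrid:02x}"


def crosses_isl(ingress_peer: str, vrrp_master: str, fabric_routing: bool) -> bool:
    """Does a routed frame sent to the VMAC have to cross the ISL?

    Simplified model: without fabric routing only the VRRP master routes frames
    addressed to the VMAC, so a frame that the host's LAG hash lands on the other
    peer must cross the ISL. With fabric routing both peers route it locally.
    """
    if fabric_routing:
        return False
    return ingress_peer != vrrp_master


if __name__ == "__main__":
    print("VMAC for VRID 1:", vrrp_virtual_mac(1))
    for fabric in (False, True):
        print("fabric routing", fabric, "->",
              crosses_isl("core2", vrrp_master="core1", fabric_routing=fabric))
```

In this model a frame arriving on the backup peer crosses the ISL only when fabric routing is off, which is the active-active benefit described above.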
5.3 MLAG & Stacking
There are circumstances where stacking and MLAG might make sense when used in conjunction with each other. An example of this is that stacking can stack multiple different switch types together, so in the core you might want one switch with all SFP ports and another with copper. What you could do is stack the two different switch types together and then MLAG them together, as per below:
5.4 Inter-Switch Traffic Flow
In relation to the S/K series, below is an example of the traffic flow that might be expected when using stacking / bonding; as can be observed, this is not always optimal. This is in contrast to MLAG, where there would be minimal traffic traversing the interlink (ISL).
Fortunately with Extreme Networks VSS there is a configurable option to make this more optimal.
set lacp outportLocalPreference [none | weak | strong | all-local]
None = Do not prefer LAG ports based on chassis
Weak = Use a weak preference towards ports on the local chassis
Strong = Use a strong preference towards ports on the local chassis
All-local = Force all packets onto local chassis ports, if possible
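To illustrate what the two extremes of this setting mean for traffic crossing the interlink, here is a toy Python model of egress port selection on a LAG that spans two chassis. It is only a sketch: the CRC32 flow hash, the port and chassis names and the function itself are assumptions made for the example, not how the hardware actually hashes, and the weak and strong settings are left out because the size of their bias is not described here.

```python
import zlib


def pick_egress_port(flow: str, lag_ports: dict[str, list[str]],
                     ingress_chassis: str, preference: str = "none") -> str:
    """Pick a LAG member port for a flow.

    lag_ports maps chassis name -> LAG member ports on that chassis.
    "none":      hash across every member port, regardless of chassis.
    "all-local": use only ports on the ingress chassis, if it has any;
                 otherwise fall back to all ports ("if possible").
    """
    if preference not in ("none", "all-local"):
        raise ValueError("only 'none' and 'all-local' are modelled in this sketch")
    all_ports = sorted(p for ports in lag_ports.values() for p in ports)
    if not all_ports:
        raise ValueError("the LAG has no member ports")
    local_ports = sorted(lag_ports.get(ingress_chassis, []))
    candidates = local_ports if (preference == "all-local" and local_ports) else all_ports
    # Stand-in flow hash (assumption): CRC32 of the flow identifier.
    return candidates[zlib.crc32(flow.encode()) % len(candidates)]


if __name__ == "__main__":
    lag = {"chassis1": ["1:1", "1:2"], "chassis2": ["2:1", "2:2"]}
    for pref in ("none", "all-local"):
        chosen = {pick_egress_port(f"flow{i}", lag, "chassis1", pref) for i in range(50)}
        print(pref, "->", sorted(chosen))
```

With all-local, every flow entering on chassis1 leaves on a chassis1 port, so in this model nothing needs to cross the interlink; with none, a share of the flows can hash to ports on the other chassis.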