MLAG setup - looks like hitting a L2 loop

  • 29 July 2019
  • 9 replies


We tried to set up MLAG between two x670 switches, and once we enabled the second "leg" (port 41 on sw1) it looks like we hit a loop. Unfortunately this is a production network, so we have very few opportunities to reproduce it.

MLAG related configs are as follows:

create mlag peer "sw2" 
configure mlag peer "sw2" ipaddress vr VR-Default
enable mlag port 41 peer "sw2" id 202
enable sharing 41 grouping 41-48 algorithm address-based L2 lacp

create mlag peer "sw1" 
configure mlag peer "sw1" ipaddress vr VR-Default
enable mlag port 41 peer "sw1" id 202
enable sharing 41 grouping 37-48 algorithm address-based L2 lacp

MLAG peers see each other, checkpoint status is 'Up'. What caught my attention is this. On sw1:

sw1.118 # debug hal show vsm 

VSM Blocking Filters:
Ingress port: 1:1
Blocked ports:
Unit 1 (inst 1 Fid A553 l3_inst 1 l3_Fid A551 l3rem_inst 1 l3rem_Fid A552 pend 0):
41 42 43 44 45 46 47 48

VSM Redirection: (Enabled)

But on sw2:

sw2.29 # debug hal show vsm 

VSM Blocking Filters:
Ingress port: 1:1
Blocked ports:

VSM Redirection: (Enabled)

Could this be the cause of the problem (that there are no blocked ports for the filter on sw2)? If so, why might they not have been added?
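To compare the two outputs without reading them by eye, the blocked-port lists can be pulled out with a short script. This is only a sketch: it assumes the `debug hal show vsm` output looks exactly like the two samples pasted above, and the helper name `blocked_ports` is made up for illustration. It reports the difference between the peers; it does not say whether an empty list is actually the cause of the loop.

```python
def blocked_ports(vsm_output):
    """Collect the port numbers listed under 'Blocked ports:'."""
    ports = []
    in_block = False
    for line in vsm_output.splitlines():
        text = line.strip()
        if text.startswith("Blocked ports:"):
            in_block = True
            continue
        if text.startswith("VSM Redirection"):
            in_block = False
        tokens = text.split()
        # Port lines are purely numeric; the "Unit 1 (...)" line is skipped.
        if in_block and tokens and all(t.isdigit() for t in tokens):
            ports.extend(int(t) for t in tokens)
    return ports

# Sample text copied from the outputs in this post.
SW1 = """VSM Blocking Filters:
Ingress port: 1:1
Blocked ports:
Unit 1 (inst 1 Fid A553 l3_inst 1 l3_Fid A551 l3rem_inst 1 l3rem_Fid A552 pend 0):
41 42 43 44 45 46 47 48

VSM Redirection: (Enabled)
"""

SW2 = """VSM Blocking Filters:
Ingress port: 1:1
Blocked ports:

VSM Redirection: (Enabled)
"""

for name, output in (("sw1", SW1), ("sw2", SW2)):
    ports = blocked_ports(output)
    print(name, ports if ports else "no blocked ports")
```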

Both switches are running

Userlevel 3
What do "show mlag peer" and "show mlag port" show on both switches?

Can you draw a network map of what you're trying to accomplish?

You created LAGs using ports 41-48 on one switch, and 37-48 on the other. Are you sure that is correct?

BTW 16.2 is end of service life in December. You should consider upgrading.
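The grouping mismatch raised above (41-48 on one switch, 37-48 on the other) can be surfaced mechanically. A minimal sketch, assuming the two `enable sharing` lines are pasted in as plain strings with simple port numbers (no slot:port notation); the helper names `lag_members` and `compare_lags` are hypothetical and not part of EXOS. Different member lists on the two peers are not necessarily wrong by themselves, so the script only lists the ports that differ for review.

```python
import re

def lag_members(config_line):
    """Return (master_port, set_of_member_ports) from an 'enable sharing' line."""
    m = re.search(r"enable sharing (\d+) grouping ([\d,\-]+)", config_line)
    if m is None:
        raise ValueError("not an 'enable sharing' line: %r" % config_line)
    master = int(m.group(1))
    members = set()
    for part in m.group(2).split(","):
        if "-" in part:
            first, last = part.split("-")
            members.update(range(int(first), int(last) + 1))
        else:
            members.add(int(part))
    return master, members

def compare_lags(line_a, line_b):
    """Return the ports that appear in only one of the two groupings."""
    _, a = lag_members(line_a)
    _, b = lag_members(line_b)
    return sorted(a ^ b)

# The two lines from the original post.
sw1 = "enable sharing 41 grouping 41-48 algorithm address-based L2 lacp"
sw2 = "enable sharing 41 grouping 37-48 algorithm address-based L2 lacp"
print(compare_lags(sw1, sw2))  # [37, 38, 39, 40]
```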
Userlevel 3
Hi, we have the same problem with a pair of x870 devices connected to another pair of x690. The firmware version running is

We have checked the MLAG configuration by running the MLAG script and it is all OK.

We have also tested the x870 with version connected to a pair of x690 with and the loop behaviour occurs again (not immediately, but after some days).

The next test is upgrading all the 4 switches to
Userlevel 5

Is that a two-tier MLAG design?
If 'show mlag peer', 'show mlag port' and 'show sharing' all look good, is it possible that the loop is introduced elsewhere in the network (even on a VLAN that is not part of the MLAG setup) and then hits the switches? You say that the loop doesn't happen immediately but after days. Is the loop happening randomly, or was the test performed a while ago?

Kind regards,
Userlevel 3
Did you add the ISC vlan to the ports between the switches?

Userlevel 3
I have used the EXOS MLAG script to test the MLAG configuration. The problems (they look like loops) appear randomly and we don't know why.

We have updated to to make sure that it is not a bug. We have also seen that this new version has a protocol for detecting MLAG loops.
Userlevel 5

You might want to try and use ELRP to spot the loop when it happens. Have a look at these:

FYI, with EXOS 30.2 and older, the ELRP periodic test interval can be as small as 100 ms. With EXOS 30.3, hardware can be used for these tests, which allows the interval to be decreased to just a few milliseconds.

Hope that helps,
Userlevel 3
We think we have finally found the problem. We still need to test it, but I'm sure that this is the problem.

We have all devices updated to I have seen that 30.3 fixes some MLAG bugs. We have seen in the logs that the x870 devices don't have enough resources to manage the MLAG ACLs.

This post has some information:

We are going to reassign the IPv6 resources to MAC resources and test the MLAG behaviour again.

I hope this resolves the problem.
Userlevel 3
Does anybody know how to show the ACL resources used by MLAG?

The command "show policy resource-profile" does not show any resources used by L2:

show policy resource-profile

Current Configured Profile: default
Current Profile Modifier : none

        MAC     IPv6    IPv4    L2
        Rules   Rules   Rules   Rules
        -----   -----   -----   -----
Max     512     512     512     440
Used    0       0       53      0

Has anyone tested the command "configure policy resource-profile more-mac-no-ipv6"?
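For tracking these counters before and after a profile change, the table can be turned into used/max pairs with a few lines of Python. This is a rough sketch that assumes the output has the same rows as the sample above (a header row starting with "MAC", then "Max" and "Used" rows); the function name `rule_usage` is made up. It only reads the policy resource-profile counters, so it does not answer the question of how many ACL resources MLAG itself consumes.

```python
def rule_usage(output):
    """Map each rule type to a (used, max) tuple."""
    names, rows = [], {}
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "MAC":
            names = tokens  # column names: MAC, IPv6, IPv4, L2
        elif tokens[0] in ("Max", "Used"):
            rows[tokens[0]] = [int(t) for t in tokens[1:]]
    return {n: (u, m) for n, u, m in zip(names, rows["Used"], rows["Max"])}

# Sample text copied from the output in this reply.
SAMPLE = """Current Configured Profile: default
Current Profile Modifier : none

MAC IPv6 IPv4 L2
Rules Rules Rules Rules
----- ----- ----- -----
Max 512 512 512 440
Used 0 0 53 0
"""

for name, (used, maximum) in rule_usage(SAMPLE).items():
    print("%-5s %d of %d rules used" % (name, used, maximum))
```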
Userlevel 3

Hi, we have resolved the problem.

We upgraded to to resolve a bug in MLAG with ARP packets.

We have also disabled policy to free up resources for ACLs.


Everything is working at the moment.


Now we are looking at how to force MLAG to use local links instead of peer links, to avoid passing traffic through the ISC.