Packet loss on Summit X670 connected with LAG/MLAG?

Hi,

I have a strange packet-loss problem. I have a LAG on ports 47-48, connected to two different switches running MLAG. I see the problem only between this pair:

src MAC f0:1f:af:d7:b5:37 and dst MAC 00:18:9d:08:c8:59; the rest of the traffic is OK.

The Summit has the dst MAC in its FDB table on port 47, but it sends the traffic via port 48 (according to the counters on ports 47 and 48). On the next switch (the uplink, also a Summit) I can't see any of these packets (checked with port mirroring and ingress counters).

If I disable port 48, everything works correctly. If I change the src MAC address on the server, everything also works correctly.

Is there a well-known LAG/MLAG hashing bug in 15.3.1.4 v1531b4-patch1-18?

Which has higher precedence when the switch has to choose which port to send on: the FDB entry or the hash algorithm?

Some outputs:

Summit8.48 # show fdb 00:18:9d:08:c8:59
Mac                 Vlan            Age   Flags   Port / Virtual Port List
------------------------------------------------------------------------------
00:18:9d:08:c8:59   vlan183(0183)   0041  d m     47

* Summit8.49 # debug hal show fdb | include 00:18:9d:08:c8:59
00:18:9d:08:c8:59 183 00001021 47 TRUE FALSE

* Summit8.65 # show policy trampek
Policies at Policy Server:
Policy: trampek
entry test4 {
    if match all {
        ethernet-source-address f0:1f:af:d7:b5:37;
    }
    then {
        log ;
        count trampek4 ;
        permit ;
    }
}
Number of clients bound to policy: 1
Client: acl bound once

Summit8.62 # show access-list counter ports 47 48 egress
Policy Name       Vlan Name        Port   Direction
    Counter Name                   Packet Count         Byte Count
==================================================================
trampek           *                47     egress
    trampek4                       0
trampek           *                48     egress
    trampek4                       10

Image: ExtremeXOS version 15.3.1.4 v1531b4-patch1-18

Sharing configuration:

enable sharing 47 grouping 47-48 algorithm address-based L3_L4 lacp
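
For context on the FDB-versus-hash question: on a load-shared group the FDB entry normally refers to the group by its master port (47 here), and the configured address-based algorithm then picks the physical member port for each flow. An FDB entry on 47 with egress on 48 is therefore not contradictory in itself. The sketch below only illustrates that idea. The CRC32 hash, the `pick_member` helper, and the IP addresses and L4 ports are assumptions made up for illustration; the real hash is computed in the switch hardware and is not reproduced here.

```python
import zlib

def pick_member(members, src_ip, dst_ip, src_port, dst_port):
    """Pick one physical LAG member for a flow.

    Illustrative only: CRC32 stands in for the hardware hash, which is
    platform-specific and not documented in this thread.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return members[zlib.crc32(key) % len(members)]

# The FDB shows master port 47, but either member may carry a given flow.
lag_members = [47, 48]

# Hypothetical flow (addresses and ports are made up).
flow = ("10.0.0.1", "10.0.0.2", 40000, 443)

first = pick_member(lag_members, *flow)
again = pick_member(lag_members, *flow)    # same flow, same member every time
fallback = pick_member([47], *flow)        # port 48 disabled: only 47 remains

print(first, again, fallback)
```

Because the selection depends only on the hashed fields, a single flow always leaves on the same member, which fits a fault that follows one address pair and disappears when port 48 is disabled or the source address changes.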

 

thanks for help,

Pedro


Pedro


Posted 3 years ago


Chad Smith, Alum

Official Response
Pedro,

The EXOS software version you are running is susceptible to a known defect that can cause traffic forwarding problems:

PD4-4490184741: Traffic loss occurs on switches due to parity errors in L3 table.
With this issue, only certain traffic flows are affected (as you have described). Confirming the issue would require a debug session with the GTAC.

PD4-4490184741 was corrected in 15.3.1.4patch1-36.  However, the current recommended software version for the X670 platform is EXOS 15.5.4.2-patch1-5.xos.
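
As a rough way to check whether a running image predates the fix, the version strings quoted in this thread can be compared numerically. This is a minimal sketch: the `parse_exos_version` helper is hypothetical, and it assumes that patch levels within the same base release are ordered by their numeric suffix.

```python
import re

def parse_exos_version(text):
    """Return (major, minor, maint, build, patch, patch_build) as integers.

    Hypothetical helper: assumes a dotted four-part base version followed
    somewhere by 'patch<N>-<M>', as in the strings quoted in this thread.
    """
    m = re.search(r"(\d+(?:\.\d+){3}).*?patch(\d+)-(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    base = tuple(int(part) for part in m.group(1).split("."))
    return base + (int(m.group(2)), int(m.group(3)))

running = parse_exos_version("15.3.1.4 v1531b4-patch1-18")  # from the question
fixed = parse_exos_version("15.3.1.4patch1-36")             # release with the fix

# Tuple comparison is element by element, so patch1-18 sorts before patch1-36.
needs_upgrade = running < fixed
print(needs_upgrade)
```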

 

Chad Smith, Alum

Official Response
I have created a GTAC Knowledge article that provides further information on these software defects: "Traffic loss due to parity errors".