LAG sharing packet loss and high latency

  • Problem
  • Updated 9 months ago
  • Solved
I have two X670G2-48x-4q switches running EXOS 16.1.3.6-patch1-9. They are connected by a 2-port LAG that works correctly.

Switch1:

enable sharing 10 grouping 10,14 algorithm address-based custom lacp

Switch2:

enable sharing 2 grouping 2,15 algorithm address-based custom lacp


When I add a third and fourth link to the LAG in production, like this:

Switch1:

Configure sharing 10 add port 37,38

Switch2:

Configure sharing 2 add port 9,17


I see packet loss and high latency.

What could be the issue? Should I disable the sharing and re-enable it with the new ports included?

Jhonn Bejar

Posted 1 year ago


Jeremy, Embassador

What do you have configured for your custom LACP config? 

Jhonn Bejar

I haven't changed its configuration; it is at the default settings.

EtherMAN, Embassador

A couple more questions:

What type of traffic is running across the links?

What is the current utilization of the 2-port LAG?

Are these 1G or 10G links?

Are those links also members of other services such as EAPS, Spanning Tree, etc.?

Jhonn Bejar

There are VLANs and VMANs configured on the links, and we carry video, voice, and data traffic. They are 10G ports; the current utilization is approximately 2.5G in both directions. They are members of EAPS and EAPS shared-ports.

Karthik Mohandoss, Employee

Hi Jhonn,

Are there any link errors on the newly added ports?

"show port <port #> rxerrors"
"show port <port #> txerrors"

Does "show sharing" indicate that the new ports have been added to the aggregation?

One quick remedy I can think of is to take a maintenance window, then disable and re-enable the sharing with all intended ports in one go.

Jhonn Bejar

Hi Karthik, there are no rxerrors or txerrors. "show sharing" shows that the new ports were added to the aggregation. There was congestion on the new ports on both sides. Regarding your remedy, is it necessary to do that on both sides (both switches)?

Ram, Employee

Yes, you can try it on both switches. Before doing so, run the command "clear counter" (it clears the counter statistics), then disable and re-enable the sharing. After reconfiguring it, check the sharing status and use the command "show lacp counters" to verify that the LACP PDUs are exchanged correctly and the other parameters are correct.

Karthik Mohandoss, Employee

Hi Jhonn,

Be careful when making these changes.
First disable the ports, then disable and re-enable the sharing.
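
The sequence suggested in this thread can be sketched for Switch1 as follows. This is only an outline under assumptions: the port numbers come from the original post, the ordering combines the "clear counter" and "disable the ports first" advice given above, and the lines starting with # are comments. Switch2 would need the equivalent steps with its own ports (2,15,9,17), and it is worth verifying the VLAN/VMAN and EAPS configuration on the LAG afterwards.

```
# Switch1, during a maintenance window
clear counters
disable ports 10,14,37,38
disable sharing 10
enable sharing 10 grouping 10,14,37,38 algorithm address-based custom lacp
enable ports 10,14,37,38
# Verify that all four members are aggregated and LACP PDUs are exchanged
show sharing
show lacp counters
```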

Edison

Hi all, I work with Jhonn at the same company. We disabled the sharing on both sides as you recommended and recreated it with all 4 ports, but we still had packet loss once the 4-port sharing formed, so we had to disable the two new ports and remove them from the sharing.

The sharing was formed correctly:
 
* MIR-EXT_AGREG.16 # show sharing 
Load Sharing Monitor
Config    Current Agg     Min     Ld Share    Ld Share  Agg   Link   Link Up
Master    Master  Control Active  Algorithm   Group     Mbr   State  Transitions
========================================================
    10     10     LACP       1     custom      10         Y      A        2
                                   custom      14         Y      A        2
                                   custom      37         Y      A        4
                                   custom      38         Y      A        2
   

There are no txerrors or rxerrors on the new ports, but we noticed again that congestion was increasing on both new ports; on the old ports there is no congestion. No log entry gives any sign of what happened.

Could this be a firmware problem?

Ram, Employee

I understand the port congestion is still seen. Please refer to the knowledge base articles below:
https://gtacknowledge.extremenetworks.com/articles/Solution/Prevent-packet-drops
 
https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-identifying-microburst-congestion-with-wireshark

https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-verify-if-the-LACP-PDUs-are-received-from-the-peer-device

Verify the information above, and if the issue cannot be isolated, please open a GTAC case so that a GTAC engineer can troubleshoot the congestion issue via a remote session.

Alexander Shikov

We have exactly the same problem.
Were you able to find the cause?

Alexandr P, Embassador

Hello, Alexander!

If you have this issue and a valid service contract on your switches, you can write to your local distributor about it.
:) 

Thank you!

Alexandr P, Embassador

Hello, all!

As Ram mentioned earlier, first check for microbursts in your traffic.
Also reconfigure shared-packet-buffer on the ports (the X670 has 10G ports, and the higher the port speed, the more microbursts can occur on it).
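
As a rough illustration of the buffer suggestion, something like the following could be tried on the two new Switch1 ports. The percentage is a placeholder assumption, not a recommended value, and the port numbers are taken from the original post; the right setting depends on the traffic and should be checked against the Prevent-packet-drops article linked earlier in the thread.

```
# 50 is a hypothetical percentage chosen only for illustration
configure ports 37,38 shared-packet-buffer 50
# Watch whether the congestion counters on the new ports keep increasing
show ports 37,38 congestion
```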

Thank you!