I am trying to find the cause of a network problem in one of our buildings. We have one C5 and six 48-port A4 switches there. The C5 is configured as the router, and all the A4s are access switches connected directly to the C5.
The problem is that on 4 of the A4 switches there is very high packet loss when I ping them. The clients connected to these switches also can't reach the internet because of the high packet loss and high latency. When I ping one of these switches from another building, it shows ~50000 ms and ~20-30% packet loss. If I disable all access ports on that switch, ping times drop to 1-2 ms instantly. When I enable them again, the problem reappears.
I looked at the logs on the switch and saw only one or two messages that look abnormal:
TXQMONITO: txq_monitor.c(507) 2097 This is from manager 1 %% Tx queue for interface ge.1.50 is in stalled mode
ge.1.50 is the uplink port to the C5. I found this KB entry: https://gtacknowledge.extremenetworks.com/articles/Solution/Securestack-Not-passing-traffic-and-Mess...
I disabled flow control as it suggests, and the problem seems to be resolved.
So why did we encounter this problem? There are only 4-5 active clients per switch. And why did we see it on only 4 of the 6 A4H switches and not on all of them? Does anyone have an idea? We have over 30 A4H switches deployed and have never seen a problem like this before.