Header Only - DO NOT REMOVE - Extreme Networks

Latency While Pinging Gateways of VLANs on Our Core Switch

  • 5 January 2017
  • 19 replies

Userlevel 1
Hello, we have a BD8810 core switch and a few models of Extreme EXOS edge switches (X430, X440, X460). All of our servers are connected to the core switch, and clients are connected to the edge switches. The core switch has multiple VLANs, each with its own tag ID, and the required VLANs are created on the edge switches with the tag IDs matching the core. For the past two days we have been seeing latency when pinging the gateway IP addresses of the VLANs configured on the core: the reply time is more than 150 ms, where previously it was less than 1 ms. We don't know where the actual issue is: in the configuration, a loop, or a slot problem in the core. Please help us resolve this issue.

19 replies

Userlevel 3
Did you check the CPU utilization of the core switch?
Userlevel 1
A lot of things could be causing the issue. Was there a change made two days ago? How are the edge switches connected to the core? What are the port configs on both ends of the connections (i.e. port speed and duplex)? Type the command "top"; what is the output?
Userlevel 5
I had a similar issue crop up a few weeks ago. It was suggested that I upgrade my firmware. I did that, and after everything was upgraded and rebooted the issue went away for me.

Here is that thread: https://community.extremenetworks.com/extreme/topics/sudden-drop-in-speed-and-response-time-across-all-ssid-and-radios
Userlevel 6
Like Olaf said, ping response is CPU driven, so running the "top" command should give you insight. In the past I have seen a few things drive up the CPU on the 8900s we have. One is MAC address churn: our tables had 100k or so MACs, and enough of them were timing out and having to be re-learned that it drove the CPU up. Increasing the timeout fixed that.
Next was SNMP queries. We have multiple systems polling for bandwidth, management, port up/down on trunk ports, etc., and this drove up the CPU. It has never affected services passing through our systems, but it does affect polling and response to pings.
You can use the ELRP client with the one-shot command to check for loops on the various VLANs you have tagged to the edge.
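
For reference, the MAC aging change described above is a single EXOS command; the value below is only an example, tune it to your environment:

```
# Example only: raise FDB aging from the default 300 s to reduce re-learn churn
configure fdb agingtime 3600
```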
Userlevel 5

To narrow down the issue, check the following:
- Run "top" to see if any particular process is spiking.
- Run "clear l2stats".
- Run "show l2stats".
- Repeat the above 3-4 times and note which VLANs have a large number of packets going to the CPU.

- Once you know which VLANs are sending large numbers of packets to the CPU, run ELRP on those VLANs.
- Type "enable elrp-client"
This enables the Extreme Loop Recovery Protocol (ELRP) client (standalone ELRP) globally.
- Type "configure elrp-client one-shot <vlan_name> ports all print-and-log"
This starts a one-time, non-periodic ELRP packet transmission on the specified ports of the VLAN.

If any layer 2 loop is detected, it will be printed in the logs; then check the physical connections.
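
Put together, the checks above look roughly like this on EXOS (the VLAN name "Data" is a placeholder; substitute each of your suspect VLANs):

```
top                                               # look for a spiking process
clear l2stats
show l2stats                                      # repeat 3-4 times, compare counters
enable elrp-client
configure elrp-client one-shot Data ports all print-and-log
show log                                          # ELRP loop detections appear here
```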
Userlevel 5

What is the EXOS version running on the BD8K switch?

The below-mentioned articles should also be helpful for you:


Userlevel 1

Thanks for your reply. Please check the below image showing output of top command.

Userlevel 5

The "top" output indicates the bcmRX process is spiking. The bcmRX process handles traffic that is being received by the switch's CPU.

In most situations this indicates a possible loop in the network. Please follow the steps I suggested earlier.
Userlevel 4
Most cases I've seen were caused by loops, but there was an interesting case which might be helpful for you.
One of my customers had a syslog server attached to a distribution switch, and the server died for some reason.
There were still lots of syslog packets coming from other servers, and they hit the switch CPU,
because the ARP and FDB entries could not be resolved on the switch, so the packets were handled as slow-path traffic.
The issue was gone after they replaced the syslog server.
We confirmed it with tcpdump in debug mode.
Userlevel 1
Sir, sorry for the delayed response. We are still facing the same issue. We tried to identify whether a loop exists in the network by running the ELRP client on all the VLANs, but no loop was detected. However, CPU utilization is still very high. Around 80 edge switches are connected to our core switch. How can we check which edge switch is causing the high CPU utilization on the core? Please help us resolve this issue.
Userlevel 6

Considering that you have real-time impact on the production environment, it would be better to open a case with GTAC so that we can assist you in diagnosing this through a remote session.

Looking at the output of top, the fdb process is also consuming a considerable amount of CPU,
so we could suspect MAC moves on the switch.

Please refer the article below for configuring the mac-tracking.


After configuring the mentioned commands, collect the output of "show log".
If there are any MAC moves, they will be displayed in the log.

Hope this helps!
Userlevel 1

Thanks for the response.

Please check the image below, which shows the log of the core switch.

The log repeatedly shows DoS-protect "packet count exceeded" messages. How can we identify where this traffic originates from? Is there any option to trace the source?

Please help us to resolve this issue.
Userlevel 6

The article below explains the dos-protect messages:


I see that the threshold is set to a very low value of 150. Usually, if the traffic has a pattern (i.e. from a specific source or to a specific destination), it will be displayed in the log.

For now, only notify-threshold log messages are seen. If the alert threshold is reached, the log will either display the traffic pattern or say "No traffic pattern found".

If you need help with the packet capture, open a case with GTAC.

Hope this helps!
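
For reference, you can inspect and adjust the dos-protect thresholds along these lines; the values below are examples only, so verify the exact syntax for your release against the article above:

```
show dos-protect                                             # current state and thresholds
configure dos-protect type l3-protect notify-threshold 3500
configure dos-protect type l3-protect alert-threshold 4000
```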
Userlevel 1
Thanks for the reply.
We ran the packet capture and exported it to a TFTP server. How can we identify where the high traffic originates from using this captured data? We tried opening the captured file with Wireshark, but we were not able to identify the source that is sending the large number of packets to the CPU.

Thanks in advance.
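
One way to rank sources outside Wireshark: export the source-IP column to a text file (for example with `tshark -r capture.pcap -T fields -e ip.src > sources.txt`; the filenames here are placeholders) and count it with a few lines of Python:

```python
from collections import Counter

def top_talkers(lines, n=5):
    """Return the n most frequent source IPs from one-IP-per-line input."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(n)

# In practice: with open("sources.txt") as f: print(top_talkers(f))
sample = ["10.0.0.5\n", "10.0.0.5\n", "224.0.0.251\n", "10.0.0.5\n"]
print(top_talkers(sample, 2))  # [('10.0.0.5', 3), ('224.0.0.251', 1)]
```

In Wireshark itself, Statistics > Conversations sorted by packet count gives a similar ranking.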
Userlevel 1
Sir, we have analyzed the packet capture file with Wireshark. From the analysis, we found that most of the entries are mDNS protocol, and all of these packets come from the Wi-Fi networks. Is this the source of the high CPU utilization on our core switch?
Userlevel 6

The following articles might help!
Userlevel 1
Sir, thanks. We are planning to add an ACL entry to block mDNS packets. Where should this ACL be added: on the core switch, or on all the edge switches? This will be very helpful to fix our issue.
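Blocking closest to the source (ingress on the edge switches, especially the WLAN-facing ports) keeps the traffic off the core entirely. As a sketch, an EXOS policy file for mDNS might look like the following; the entry name, counter name, and port list are placeholders, so verify against the documentation for your release:

```
entry deny-mdns {
    if {
        protocol udp;
        destination-address 224.0.0.251/32;
        destination-port 5353;
    } then {
        deny;
        count mdns_drops;
    }
}
```

Save it as deny-mdns.pol on the switch, then apply it with something like `configure access-list deny-mdns ports <edge-facing ports> ingress` and watch the counter with `show access-list counter`.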
Userlevel 2
Have you fixed your problem, Thavamani, by blocking mDNS?