<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Stack switch fails and brings the whole LAN down in ExtremeSwitching (EXOS/Switch Engine)</title>
    <link>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20892#M1357</link>
    <description>I have x440-48T switch stacks in my local LAN. Twice over the last 6 months the LAN has been brought to a standstill because of a failure on a stack; each time it was a different stack. No one can connect from outside and the whole network is down. We have LACP enabled on all stacks.&lt;BR /&gt;
&lt;BR /&gt;
This is the error I see. After a reboot of the stack it is fine again, but I don't understand why one failed switch would bring the whole network to a standstill.&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Min: 0.0 sec Avg: 0.868 sec Max: 1.10 sec&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Last Execution: 1.10 sec&lt;BR /&gt;
09/11/2017 09:06:45.35 &lt;I&gt; Slot-2: snmpMaster initialization complete&lt;BR /&gt;
09/11/2017 09:06:44.01 &lt;I&gt; Slot-2: **** telnetd started *****&lt;BR /&gt;
09/11/2017 09:06:41.60 &lt;I&gt; Slot-2: DOS protect application started successfully&lt;BR /&gt;
09/11/2017 09:06:41.49 &lt;I&gt; Slot-2: **** tftpd started *****&lt;BR /&gt;
09/11/2017 09:06:37.55 &lt;I&gt; Slot-2: snmpSubagent initialization complete&lt;BR /&gt;
09/11/2017 09:06:37.44 &lt;I&gt; Slot-2: Network Login framework has been initialized&lt;BR /&gt;
09/11/2017 09:06:34.83 &lt;DM.NOTICE&gt; Slot-2: Slot-2 being Powered ON&lt;BR /&gt;
09/11/2017 09:06:34.73 &lt;DM.NOTICE&gt; Slot-2: Node State[1] = INIT&lt;BR /&gt;
09/11/2017 09:06:34.15 &lt;I&gt; Slot-2: Hal initialization done.&lt;BR /&gt;
09/11/2017 09:06:33.67 &lt;I&gt; Slot-2: Module in Slot-2 is inserted&lt;BR /&gt;
09/11/2017 09:06:32.57 &lt;I&gt; Slot-2: Starting hal initialization ....&lt;BR /&gt;
09/11/2017 09:06:28.39 &lt;I&gt; Slot-2: telnetd listening on port 23&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:06:19.52 &lt;NM.STRTPROC&gt; Slot-2: The Node Manager (NM) has started processing.&lt;BR /&gt;
09/11/2017 09:06:19.14 &lt;DM.NOTICE&gt; Slot-2: DM started&lt;BR /&gt;
09/11/2017 09:06:18.44 &lt;EPM.START&gt; Slot-2: EPM Started&lt;BR /&gt;
09/11/2017 09:06:18.43 &lt;EPM.UNEXPCTREBOOTDTECT&gt; Slot-2: Booting after System Failure.&lt;BR /&gt;
09/11/2017 09:06:17.06 &lt;EPM.WD_WARM_RESET&gt; Slot-2: Changing to watchdog warm reset mode&lt;BR /&gt;
09/11/2017 06:35:15.24 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.21&lt;BR /&gt;
09/11/2017 06:35:15.19 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.20&lt;BR /&gt;
09/11/2017 06:17:14.95 &lt;EPM.ALL_SHUTDOWN&gt; Slot-2: Shutting down all processes&lt;BR /&gt;
09/11/2017 06:17:14.92 &lt;DM.WARNING&gt; Slot-2: Slot-2 FAILED (1) Backup lost&lt;BR /&gt;
09/11/2017 06:17:14.51 &lt;DM.WARNING&gt; Slot-1: BACKUP is NOT in SYNC&lt;BR /&gt;
09/11/2017 06:17:14.50 &lt;DM.WARNING&gt; Slot-1: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.ERROR&gt; Slot-2: Node State[4] = FAIL (Backup lost)&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: MASTER decided that I am not BACKUP anymore&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
05/15/2017 17:29:10.14 &lt;ELRP.DSBLPORTLOOPDTECT&gt; Slot-1: Disabling port 1:48. Auto re-enable port after 30 seconds&lt;BR /&gt;
05/15/2017 17:29:10.14 &lt;ELRP.DSBLPORTLOOPDTECT&gt; Slot-1: Disabling port 1:48. Auto re-enable port after 30 seconds&lt;BR /&gt;
&lt;BR /&gt;</description>
    <pubDate>Fri, 15 Sep 2017 13:14:00 GMT</pubDate>
    <dc:creator>Steven_Marriott</dc:creator>
    <dc:date>2017-09-15T13:14:00Z</dc:date>
    <item>
      <title>Stack switch fails and brings the whole LAN down</title>
      <link>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20892#M1357</link>
      <description>I have x440-48T switch stacks in my local LAN. Twice over the last 6 months the LAN has been brought to a standstill because of a failure on a stack; each time it was a different stack. No one can connect from outside and the whole network is down. We have LACP enabled on all stacks.&lt;BR /&gt;
&lt;BR /&gt;
This is the error I see. After a reboot of the stack it is fine again, but I don't understand why one failed switch would bring the whole network to a standstill.&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Min: 0.0 sec Avg: 0.868 sec Max: 1.10 sec&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Last Execution: 1.10 sec&lt;BR /&gt;
09/11/2017 09:06:45.35 &lt;I&gt; Slot-2: snmpMaster initialization complete&lt;BR /&gt;
09/11/2017 09:06:44.01 &lt;I&gt; Slot-2: **** telnetd started *****&lt;BR /&gt;
09/11/2017 09:06:41.60 &lt;I&gt; Slot-2: DOS protect application started successfully&lt;BR /&gt;
09/11/2017 09:06:41.49 &lt;I&gt; Slot-2: **** tftpd started *****&lt;BR /&gt;
09/11/2017 09:06:37.55 &lt;I&gt; Slot-2: snmpSubagent initialization complete&lt;BR /&gt;
09/11/2017 09:06:37.44 &lt;I&gt; Slot-2: Network Login framework has been initialized&lt;BR /&gt;
09/11/2017 09:06:34.83 &lt;DM.NOTICE&gt; Slot-2: Slot-2 being Powered ON&lt;BR /&gt;
09/11/2017 09:06:34.73 &lt;DM.NOTICE&gt; Slot-2: Node State[1] = INIT&lt;BR /&gt;
09/11/2017 09:06:34.15 &lt;I&gt; Slot-2: Hal initialization done.&lt;BR /&gt;
09/11/2017 09:06:33.67 &lt;I&gt; Slot-2: Module in Slot-2 is inserted&lt;BR /&gt;
09/11/2017 09:06:32.57 &lt;I&gt; Slot-2: Starting hal initialization ....&lt;BR /&gt;
09/11/2017 09:06:28.39 &lt;I&gt; Slot-2: telnetd listening on port 23&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:06:19.52 &lt;NM.STRTPROC&gt; Slot-2: The Node Manager (NM) has started processing.&lt;BR /&gt;
09/11/2017 09:06:19.14 &lt;DM.NOTICE&gt; Slot-2: DM started&lt;BR /&gt;
09/11/2017 09:06:18.44 &lt;EPM.START&gt; Slot-2: EPM Started&lt;BR /&gt;
09/11/2017 09:06:18.43 &lt;EPM.UNEXPCTREBOOTDTECT&gt; Slot-2: Booting after System Failure.&lt;BR /&gt;
09/11/2017 09:06:17.06 &lt;EPM.WD_WARM_RESET&gt; Slot-2: Changing to watchdog warm reset mode&lt;BR /&gt;
09/11/2017 06:35:15.24 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.21&lt;BR /&gt;
09/11/2017 06:35:15.19 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.20&lt;BR /&gt;
09/11/2017 06:17:14.95 &lt;EPM.ALL_SHUTDOWN&gt; Slot-2: Shutting down all processes&lt;BR /&gt;
09/11/2017 06:17:14.92 &lt;DM.WARNING&gt; Slot-2: Slot-2 FAILED (1) Backup lost&lt;BR /&gt;
09/11/2017 06:17:14.51 &lt;DM.WARNING&gt; Slot-1: BACKUP is NOT in SYNC&lt;BR /&gt;
09/11/2017 06:17:14.50 &lt;DM.WARNING&gt; Slot-1: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.ERROR&gt; Slot-2: Node State[4] = FAIL (Backup lost)&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: MASTER decided that I am not BACKUP anymore&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
05/15/2017 17:29:10.14 &lt;ELRP.DSBLPORTLOOPDTECT&gt; Slot-1: Disabling port 1:48. Auto re-enable port after 30 seconds&lt;BR /&gt;
05/15/2017 17:29:10.14 &lt;ELRP.DSBLPORTLOOPDTECT&gt; Slot-1: Disabling port 1:48. Auto re-enable port after 30 seconds&lt;BR /&gt;
&lt;BR /&gt;</description>
      <pubDate>Fri, 15 Sep 2017 13:14:00 GMT</pubDate>
      <guid>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20892#M1357</guid>
      <dc:creator>Steven_Marriott</dc:creator>
      <dc:date>2017-09-15T13:14:00Z</dc:date>
    </item>
    <item>
      <title>RE: Stack switch fails and brings the whole LAN down</title>
      <link>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20893#M1358</link>
      <description>Hi Steven,&lt;BR /&gt;
&lt;BR /&gt;
A few things to consider here.&lt;BR /&gt;
&lt;BR /&gt;
The stack seems to have detected a loop, and ELRP seems to have disabled port 1:48 in the stack.&lt;BR /&gt;
&lt;BR /&gt;
05/15/2017 17:29:10.14 &lt;ELRP.DSBLPORTLOOPDTECT&gt; Slot-1: Disabling port 1:48. Auto re-enable port after 30 seconds&lt;BR /&gt;
&lt;BR /&gt;
This event is followed by the Slot-2 reboot, after which the log records the messages below:&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Min: 0.0 sec Avg: 0.868 sec Max: 1.10 sec&lt;BR /&gt;
09/11/2017 09:07:07.74 &lt;I&gt; Slot-2: perfTimer Execution time of Timer Thread select (38): Last Execution: 1.10 sec&lt;BR /&gt;
&lt;BR /&gt;
Please check this GTAC article for the above log messages:&lt;BR /&gt;
&lt;BR /&gt;
&lt;A href="https://gtacknowledge.extremenetworks.com/articles/Q_A/What-are-perfTimer-Execution-messages" target="_blank" rel="nofollow noreferrer noopener"&gt;https://gtacknowledge.extremenetworks.com/articles/Q_A/What-are-perfTimer-Execution-messages&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
As for the loop, it is very likely that it could stop the switch from processing any traffic by flooding the switch CPU with a huge volume of broadcasts, bringing the switch almost to a standstill.&lt;BR /&gt;
&lt;BR /&gt;
The log message shows that ELRP broke the loop by disabling the port.&lt;BR /&gt;
&lt;BR /&gt;
Has the ELRP PDU timer been changed by any chance?&lt;BR /&gt;
&lt;BR /&gt;
Also, at the time of the freeze, please check the output of "top" on the switch over the console/serial cable to see whether the CLI is functional and whether the process "bcmRx" or any other process spikes.&lt;BR /&gt;
 &lt;BR /&gt;
Thank You,&lt;BR /&gt;</description>
      <pubDate>Fri, 15 Sep 2017 13:38:00 GMT</pubDate>
      <guid>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20893#M1358</guid>
      <dc:creator>Ariyakudi_Srini</dc:creator>
      <dc:date>2017-09-15T13:38:00Z</dc:date>
    </item>
    <item>
      <title>RE: Stack switch fails and brings the whole LAN down</title>
      <link>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20894#M1359</link>
      <description>Sorry, I copied earlier log messages that have nothing to do with this issue; the logs for this issue start here.&lt;BR /&gt;
&lt;BR /&gt;
As I asked in my previous post, why would this stack take the whole network down?&lt;BR /&gt;
&lt;BR /&gt;
09/11/2017 09:06:19.52 &lt;NM.STRTPROC&gt; Slot-2: The Node Manager (NM) has started processing.&lt;BR /&gt;
09/11/2017 09:06:19.14 &lt;DM.NOTICE&gt; Slot-2: DM started&lt;BR /&gt;
09/11/2017 09:06:18.44 &lt;EPM.START&gt; Slot-2: EPM Started&lt;BR /&gt;
09/11/2017 09:06:18.43 &lt;EPM.UNEXPCTREBOOTDTECT&gt; Slot-2: Booting after System Failure.&lt;BR /&gt;
09/11/2017 09:06:17.06 &lt;EPM.WD_WARM_RESET&gt; Slot-2: Changing to watchdog warm reset mode&lt;BR /&gt;
09/11/2017 06:35:15.24 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.21&lt;BR /&gt;
09/11/2017 06:35:15.19 &lt;NETTOOLS.SNTP.TXREQTOSRVRFAIL&gt; Slot-2: Failed to send SNTP request to server 10.0.100.20&lt;BR /&gt;
09/11/2017 06:17:14.95 &lt;EPM.ALL_SHUTDOWN&gt; Slot-2: Shutting down all processes&lt;BR /&gt;
09/11/2017 06:17:14.92 &lt;DM.WARNING&gt; Slot-2: Slot-2 FAILED (1) Backup lost&lt;BR /&gt;
09/11/2017 06:17:14.51 &lt;DM.WARNING&gt; Slot-1: BACKUP is NOT in SYNC&lt;BR /&gt;
09/11/2017 06:17:14.50 &lt;DM.WARNING&gt; Slot-1: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.ERROR&gt; Slot-2: Node State[4] = FAIL (Backup lost)&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: MASTER decided that I am not BACKUP anymore&lt;BR /&gt;
09/11/2017 06:17:14.48 &lt;DM.WARNING&gt; Slot-2: BACKUP NODE (Slot-2) DOWN&lt;BR /&gt;
&lt;BR /&gt;
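To put numbers on the timestamps quoted in this thread, here is a minimal Python sketch. It assumes the log timestamps are MM/DD/YYYY with hundredths of a second and that nothing was logged between the lines shown; the variable and function names are only illustrative.&lt;BR /&gt;

```python
from datetime import datetime

# Assumption: timestamps are MM/DD/YYYY HH:MM:SS.hh (hundredths of a second).
LOG_TIME_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def parse_log_time(stamp):
    """Parse a timestamp copied from the log excerpt (hypothetical helper)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# Timestamps copied from the log lines quoted in this thread.
elrp_port_disabled = parse_log_time("05/15/2017 17:29:10.14")  # ELRP disables port 1:48
slot2_shutdown = parse_log_time("09/11/2017 06:17:14.95")      # Slot-2: Shutting down all processes
slot2_reset = parse_log_time("09/11/2017 09:06:17.06")         # Slot-2: watchdog warm reset mode

# Whole days between the ELRP event and the Slot-2 failure.
days_since_elrp = (slot2_shutdown - elrp_port_disabled).days

# Time between Slot-2 shutting down and the first log line of its next boot.
silent_gap = slot2_reset - slot2_shutdown

print(f"ELRP event to Slot-2 failure: {days_since_elrp} days")
print(f"Slot-2 shutdown to next boot message: {silent_gap}")
```

Under those assumptions, the ELRP entry is from months before the failure, and Slot-2 logged only the two SNTP failures for almost three hours between shutting down and booting again.&lt;BR /&gt;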
Thank you for your quick response.&lt;BR /&gt;
&lt;BR /&gt;</description>
      <pubDate>Fri, 15 Sep 2017 13:53:00 GMT</pubDate>
      <guid>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20894#M1359</guid>
      <dc:creator>Steven_Marriott</dc:creator>
      <dc:date>2017-09-15T13:53:00Z</dc:date>
    </item>
    <item>
      <title>RE: Stack switch fails and brings the whole LAN down</title>
      <link>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20895#M1360</link>
      <description>Hi Steven,&lt;BR /&gt;
&lt;BR /&gt;
The logs above contain nothing substantial enough to draw a conclusion about the stack freeze.&lt;BR /&gt;
&lt;BR /&gt;
Did you have the opportunity to check the master switch CLI over the console? Does that also not respond?&lt;BR /&gt;
What is the EXOS version in the stack?&lt;BR /&gt;
&lt;BR /&gt;
You mentioned that the issue seems to resolve after a reboot. Are you referring to a power cycle of the stack units, or are you rebooting the stack from the CLI?&lt;BR /&gt;
&lt;BR /&gt;
Thank you,</description>
      <pubDate>Fri, 15 Sep 2017 16:10:00 GMT</pubDate>
      <guid>https://community.extremenetworks.com/t5/extremeswitching-exos-switch/stack-switch-fails-and-brings-the-whole-lan-down/m-p/20895#M1360</guid>
      <dc:creator>Ariyakudi_Srini</dc:creator>
      <dc:date>2017-09-15T16:10:00Z</dc:date>
    </item>
  </channel>
</rss>

