
S-Series Resets with bcmStrat "MEM_FAIL interrupt" error in Log

FAQ_User
Extreme Employee
Article ID: 13306

Products
S-Series

Symptoms
A slot module has reset.

A "MEM_FAIL interrupt", "MEM_FAIL_INT_STAT", and/or "MEM_FAIL condition is fatal" error is seen in the message log ('show support').
The following examples, taken from different systems, show some of the combinations seen; others are possible:

* "MEM_FAIL_INT_STAT=0x00000001":

Message 4/179 Syslog Message 07.21.02.0002 11/30/2012 14:55:36
<0>bcmStrat[3.tNimIntr]MEM_FAIL condition is fatal. ( 0x0093418c 0x009e8
868 0x004414b4 0x00441638 0x00f7bbd4 0xeeeeeeee )
Message 5/179 Syslog Message 07.21.02.0002 11/30/2012 14:55:36
<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000
0000000, IP1_INTR_STATUS=0000000000, IP2_INTR_STATUS=0000000000, IP3_INT
R_STATUS=0000000000, EP_INTR_STATUS=0000000000
* "MEM_FAIL_INT_STAT=0x00000001"+"IP1_INTR_STATUS=0x00000200": See article 14830.

* "MEM_FAIL_INT_STAT=0x00000001"+"IP1_INTR_STATUS=0x00200000":

Message 6/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<0>bcmStrat[3.tNimIntr]MEM_FAIL condition is fatal. ( 0x0090d518 0x009bd
0ac 0x00441140 0x004412c4 0x00f49254 0xeeeeeeee )
Message 7/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000
0000000, IP1_INTR_STATUS=0x00200000, IP2_INTR_STATUS=0000000000, IP3_INT
R_STATUS=0000000000, EP_INTR_STATUS=0000000000
Message 8/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<3>bcmStrat[3.tNimIntr]MEM_FAIL interrupt occurred on chip 12!
* "MEM_FAIL_INT_STAT=0x00000200": See article 14830.

* "MEM_FAIL_INT_STAT=0x00010000":

Message 5/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<0>bcmStrat[2.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009cca7
0 0x00a81428 0x00a86240 0x00a8b0d0 0x00a83d34 0x00447ce4 0x00449c08 0x01
134eb4 0xeeeeeeee )
Message 6/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00010000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 8!
* "MEM_FAIL_INT_STAT=0x00040000":

Message 5/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<0>bcmStrat[2.tNimIntr]Chip 4 has encountered a fatal error. ( 0x00a8afc
4 0x00b45990 0x00b4a80c 0x00b4fc24 0x00b4829c 0x004529dc 0x00454c24 0x01
1fbff4 0xeeeeeeee )
Message 6/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00040000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 4!
* "EP_INTR_STATUS=0x00000080":

Message 5/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<0>bcmStrat[1.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009cc9a
8 0x00a81224 0x00a8603c 0x00a8aecc 0x00a83b30 0x00447ce4 0x00449c08 0x01
134d14 0xeeeeeeee )
Message 6/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000080, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<3>bcmStrat[1.tNimIntr]MEM_FAIL interrupt occurred on chip 8!
* "IP0_INTR_STATUS=0x00000001":

Message 5/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<0>bcmStrat[4.tNimIntr]Chip 8 has encountered a fatal error. ( 0x00a8b68
8 0x00b4607c 0x00b4aef8 0x00b50310 0x00b48988 0x004529e8 0x00454c30 0x01
1fc6d4 0xeeeeeeee )
Message 6/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<3>bcmStrat[4.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000001, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<3>bcmStrat[4.tNimIntr]MEM_FAIL interrupt occurred on chip 8!
* "IP1_INTR_STATUS=0x00000010":

Message 15/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<0>bcmStrat[2.tNimIntr]Chip 4 has encountered a fatal error. ( 0x009cca7
0 0x00a81428 0x00a86240 0x00a8b0d0 0x00a83d34 0x00447ce4 0x00449c08 0x01
134eb4 0xeeeeeeee )
Message 16/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000010, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 17/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 4!
* "IP1_INTR_STATUS=0x00000040":

Message 5/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<0>bcmStrat[2.tNimIntr]Chip 12 has encountered a fatal error. ( 0x009cc9
a8 0x00a81224 0x00a8603c 0x00a8aecc 0x00a83b30 0x00447ce4 0x00449c08 0x0
1134d14 0xeeeeeeee )
Message 6/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000040, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 12!
* "IP3_INTR_STATUS=0x00000002":

Message 7/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<0>bcmStrat[1.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009deba
0 0x00a9d35c 0x00aa2174 0x00aa7004 0x00a9fc68 0x0044be38 0x0044de04 0x01
1541b4 0xeeeeeeee )
Message 8/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000002
Message 9/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<3>bcmStrat[1.tNimIntr]MEM_FAIL interrupt occurred on chip 8!
* All statuses zero:

Message 10/57 Syslog Message 07.41.02.0014 01/07/2012 17:32:34
<0>bcmStrat[1.tNimIntr]Chip 0 has encountered a fatal error. ( 0x009deba
0 0x00a9d35c 0x00aa2174 0x00aa7004 0x00a9fc68 0x0044be38 0x0044de04 0x01
1541b4 0xeeeeeeee )
Message 11/57 Syslog Message 07.41.02.0014 01/07/2012 17:32:34
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
* Just "MEM_FAIL interrupt occurred":

Message 5/42 Syslog Message 07.11.01.0026 09/20/2012 04:41:36
<0>bcmStrat[2.tNimIntr]Fatal error: MEM_FAIL interrupt occurred on chip
0. ( 0x0090c368 0x009bb92c 0x0044100c 0x00441190 0x00f47254 0xeeeeeeee )
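
The status line in each example above is wrapped by the log display, so a register name or value can be split across two lines. The following is a minimal Python sketch for rejoining one message and listing its nonzero registers. It assumes the wrap inserts no characters of its own, so the lines can simply be concatenated. The helper names `unwrap`, `parse_status`, and `nonzero_status` are hypothetical and not part of any Extreme tool.

```python
import re

# NAME=VALUE pairs such as MEM_FAIL_INT_STAT=0x00010000. Older firmware
# prints zero values without the 0x prefix (e.g. IP0_INTR_STATUS=0000000000).
STATUS_RE = re.compile(r"(\w*STAT\w*)=(?:0x)?([0-9a-fA-F]+)")


def unwrap(lines):
    """Concatenate the wrapped lines of one log message (assumes a plain wrap)."""
    return "".join(line.strip("\r\n") for line in lines)


def parse_status(message):
    """Return {register_name: integer_value} for every status field found."""
    return {name: int(value, 16) for name, value in STATUS_RE.findall(message)}


def nonzero_status(message):
    """Return only the registers whose value is not zero."""
    return {name: value for name, value in parse_status(message).items() if value}


# One of the status messages shown above, as wrapped in the log.
example = [
    "<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000",
    "0000000, IP1_INTR_STATUS=0x00200000, IP2_INTR_STATUS=0000000000, IP3_INT",
    "R_STATUS=0000000000, EP_INTR_STATUS=0000000000",
]
for name, value in nonzero_status(unwrap(example)).items():
    print(f"{name}=0x{value:08x}")
```

The set of nonzero registers is what the bullet labels in the list above describe, so it is a convenient key when comparing a new log against these examples.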
Solution
For "MEM_FAIL interrupt" events whose status values exactly match those in article 14830, use that article instead.
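
As a rough illustration of the "exactly match" check, the Python sketch below compares a set of status values against the two combinations that the Symptoms list refers to article 14830. It assumes that an exact match means the same nonzero registers with the same values and every other register at zero. Only those two combinations are encoded here; article 14830 itself remains the authoritative list. The names `ARTICLE_14830_SIGNATURES` and `matches_14830` are hypothetical.

```python
# Combinations that the Symptoms list above refers to article 14830.
# Assumption: every register not listed in a combination is zero.
ARTICLE_14830_SIGNATURES = [
    {"MEM_FAIL_INT_STAT": 0x00000001, "IP1_INTR_STATUS": 0x00000200},
    {"MEM_FAIL_INT_STAT": 0x00000200},
]


def matches_14830(status):
    """Return True if the nonzero registers equal one of the known combinations."""
    nonzero = {name: value for name, value in status.items() if value != 0}
    return nonzero in ARTICLE_14830_SIGNATURES


# Same MEM_FAIL_INT_STAT value but a different IP1 value: not a match.
print(matches_14830({"MEM_FAIL_INT_STAT": 0x1, "IP1_INTR_STATUS": 0x00200000}))
```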

Otherwise, upgrade to firmware 8.21.02.0001 or higher.
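
Firmware versions appear with and without a leading zero (for example 07.21.02.0002 in the logs and 8.21.02.0001 here), so a plain string comparison is unreliable. A minimal Python sketch of a numeric comparison follows. It assumes the version is always dot-separated numeric fields, and the function names are hypothetical.

```python
def parse_firmware(version):
    """Turn '08.32.02.0008' into (8, 32, 2, 8) for field-by-field comparison."""
    return tuple(int(part) for part in version.strip().split("."))


# Minimum release recommended by this article.
MIN_FIXED = parse_firmware("8.21.02.0001")


def needs_upgrade(running):
    """Return True if the running firmware is older than 8.21.02.0001."""
    return parse_firmware(running) < MIN_FIXED


for version in ("07.21.02.0002", "08.11.04.0005", "8.21.02.0001"):
    print(version, "upgrade needed" if needs_upgrade(version) else "at or above minimum")
```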

Firmware 7.21.02.0002 release notes state, in the 'Problems Corrected in 7.21.02.0002' section:
If ports are receiving traffic while their MAC chip is being initialized the MAC may mishandle these packets and cause errors similar to "1. Fuji RX MAIN intr: Fuji=8, Adr=0, Reg=0x00200040", "2. Fuji MAC MAIN intr: Fuji=0, Adr=0, Reg=0x00080000", "3. MEM_FAIL interrupt occurred on chip" (which resets blade), or one of several bcmstrat memory system parity errors sometimes reported as "bcmStrat[3.bcmDPC]unit 0 (null) asserted ". The "parity" and "null" messages are displayed constantly and result in high CPU utilization.

Firmware 8.11.03.0005 release notes state, in the 'Platform Problems Corrected in 8.11.03.0005' section:
System logs the message "bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000010, IP2_INTR_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[5.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00000080, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000" and resets.

Firmware 8.11.04.0005 release notes state, in the 'Platform Problems Corrected in 8.11.04.0005' section:
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00200000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000001,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00040000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000002" and resets.

Firmware 8.21.02.0001 release notes do not list this "all statuses zero" item, though it is corrected in this release:
System logs the message "bcmStrat[4.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EGR_INTR0_STATUS=0x00000000, EGR_INTR1_STATUS=0x00000000,
IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000,
IP2_INTR_STATUS=0x00000000, IP2_INTR_STATUS_2=0x00000000,
IP3_INTR_STATUS=0x00000000, IP4_INTR_STATUS=0x00000000,
IP5_INTR_STATUS=0x00000000, IP5_INTR_STATUS_1=0x00000000,
IP5_INTR_STATUS_2=0x00000000" and resets.
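
The release-note excerpts above can be collected into a small lookup keyed by the nonzero registers. The Python sketch below is transcribed only from the excerpts quoted in this article. It assumes each quoted message has exactly one nonzero register, as shown, and it is not a complete list of corrected signatures. The names `FIXED_IN` and `corrected_in` are hypothetical.

```python
# Signature (set of nonzero registers) -> release whose notes quote it above.
FIXED_IN = {
    frozenset({("IP1_INTR_STATUS", 0x00000010)}): "8.11.03.0005",
    frozenset({("EP_INTR_STATUS", 0x00000080)}): "8.11.03.0005",
    frozenset({("MEM_FAIL_INT_STAT", 0x00200000)}): "8.11.04.0005",
    frozenset({("IP2_INTR_STATUS", 0x00000001)}): "8.11.04.0005",
    frozenset({("MEM_FAIL_INT_STAT", 0x00040000)}): "8.11.04.0005",
    frozenset({("IP3_INTR_STATUS", 0x00000002)}): "8.11.04.0005",
    frozenset(): "8.21.02.0001",  # all statuses zero
}


def corrected_in(status):
    """Return the release quoted for this signature, or None if it is not listed."""
    signature = frozenset((name, value) for name, value in status.items() if value)
    return FIXED_IN.get(signature)


print(corrected_in({"MEM_FAIL_INT_STAT": 0x0, "IP1_INTR_STATUS": 0x10}))
```

A result of `None` only means the signature is not among the excerpts quoted here. The recommendation to run 8.21.02.0001 or higher applies either way.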

If the issue persists after upgrade, please contact the GTAC for assistance.
1 REPLY

amit_ramani
New Contributor

Hello Team Extreme,

I am writing from Invensys Foxboro by Schneider Electric Systems India Pvt. Ltd.

We observed a fatal memory error on chip 8 on an SSA180 / SSA-G8018-0652 switch running firmware 08.32.02.0008.

The switch was found to have restarted once. Sample support logs are below.

   Current Time:      12/16/2020 16:48:38
   System Uptime:     003 days, 17 hours, 57 minutes, 35 seconds

==============================================================================
Message   1/238  Syslog Message            08.32.02.0008   12/12/2020 22:53:17
    <3>System[1]Ambient air temperature is normal (26 C)
==============================================================================
Message   2/238  System Init               08.32.02.0008   12/12/2020 22:52:03

   The software system has successfully completed initialization
   and is entering its normal operational mode.
==============================================================================
Message   3/238  Informational             08.32.02.0008   12/12/2020 22:51:01

    Device was last fully operational in user mode: 42 seconds ago. Last res
    et caused by: user action.
==============================================================================
Message   4/238  Shutdown                  08.32.02.0008   12/12/2020 22:50:38

  Completed - Reset           
  Shutdown process ends.  System will be reset. 

==============================================================================
Message   5/238  Shutdown                  08.32.02.0008   12/12/2020 22:50:33

  Initiated - Reset           
  Shutdown process is starting. 

==============================================================================
Message   6/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <0>bcmStrat[1.bcmDPC]Chip 8 has encountered a fatal error. ( 0x00e8a570 0x00f6437c 0x00f6b6bc 0x00f5d814 0x01153664 0x011fc6c0 0x011fd2bc 0x0113fdec 0x01af2924 0xeeeeeeee )
==============================================================================
Message   7/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.bcmDPC]FATAL memory error detected on chip 8 (arg2=0xa1c, arg3=0x72f2)
==============================================================================
Message   8/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.bcmDPC]Unit 8: CLEAR_RESTORE: L2X[2588] blk: ipipe0 index: 32766 : [0][0]
==============================================================================
Message   9/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.bcmDPC]Unit 8: mem: 2588=L2X blkoffset:8
==============================================================================
Message  10/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.bcmDPC]unit 8 L2X entry 32766 parity error
==============================================================================
Message  11/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000090, IP2_INTR_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000, IP3_INTR_STATUS_1=0x00000000, IP3_INTR_STATUS_2=0x00000000, EP_INTR0_STATUS=0x00000000, EP_INTR1_STATUS=0x00000000, GPORT_INTR_STATUS0=0x00000000, GPORT_INTR_STATUS1=0x00000000, XPORT_INTR_STATUS0=0x00000000, XPORT_INTR_STATUS1=0x00000000, XPORT_INTR_STATUS2=0x00000000, XPORT_INTR_STATUS3=0x00000000, XQPORT_INTR_STATUS0=0x00000000, XQPORT_INTR_STATUS1=0x00000000, XQPORT_INTR_STATUS2=0x00000000, XQPORT_INTR_STATUS3=0x00000000, XQPORT_INTR_STATUS4=0x00000000, XQPORT_INTR_STATUS5=0x00000000, SPORT_INTR_STATUS=0x00000000
==============================================================================
Message  12/238  Syslog Message            08.32.02.0008   12/12/2020 22:50:33
    <3>bcmStrat[1.tNimIntr]MEM_FAIL interrupt occurred on chip 8!
==============================================================================
Message  13/238  Syslog Message            08.32.02.0008   11/06/2020 12:56:51
    <3>System[1]Ambient air temperature is normal (32 C)
==============================================================================
Message  14/238  Syslog Message            08.32.02.0008   11/06/2020 11:34:11
    <3>System[1]Ambient air temperature is warm (35 C)
==============================================================================
Message  15/238  Syslog Message            08.32.02.0008   07/18/2018 10:56:45
    <3>System[1]Ambient air temperature is normal (32 C)
==============================================================================
 

 

To avoid further downtime, we are for now replacing this switch with spare SSA hardware running firmware 08.63.07.0003, based on clearance.

We are sharing this for your reference, for future functional improvement of SSA hardware.

Thanks & Regards,

Amit Ramani

Schneider Electric Systems India,

+919625226475

 

 

 
