Header Only - DO NOT REMOVE - Extreme Networks

S-Series Resets with bcmStrat "MEM_FAIL interrupt" error in Log


Article ID: 13306

Products
S-Series

Symptoms
A slot module has reset.

A "MEM_FAIL interrupt", "MEM_FAIL_INT_STAT", and/or "MEM_FAIL condition is fatal" error appears in the message log ('show support').
The following examples, taken from different systems, show several of the combinations observed (others are possible):

* "MEM_FAIL_INT_STAT=0x00000001":

Message 4/179 Syslog Message 07.21.02.0002 11/30/2012 14:55:36
<0>bcmStrat[3.tNimIntr]MEM_FAIL condition is fatal. ( 0x0093418c 0x009e8
868 0x004414b4 0x00441638 0x00f7bbd4 0xeeeeeeee )
Message 5/179 Syslog Message 07.21.02.0002 11/30/2012 14:55:36
<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000
0000000, IP1_INTR_STATUS=0000000000, IP2_INTR_STATUS=0000000000, IP3_INT
R_STATUS=0000000000, EP_INTR_STATUS=0000000000

* "MEM_FAIL_INT_STAT=0x00000001"+"IP1_INTR_STATUS=0x00000200": See article 14830.

* "MEM_FAIL_INT_STAT=0x00000001"+"IP1_INTR_STATUS=0x00200000":

Message 6/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<0>bcmStrat[3.tNimIntr]MEM_FAIL condition is fatal. ( 0x0090d518 0x009bd
0ac 0x00441140 0x004412c4 0x00f49254 0xeeeeeeee )
Message 7/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000
0000000, IP1_INTR_STATUS=0x00200000, IP2_INTR_STATUS=0000000000, IP3_INT
R_STATUS=0000000000, EP_INTR_STATUS=0000000000
Message 8/273 Syslog Message 07.11.02.0003 09/10/2012 09:20:52
<3>bcmStrat[3.tNimIntr]MEM_FAIL interrupt occurred on chip 12!

* "MEM_FAIL_INT_STAT=0x00000200": See article 14830.

* "MEM_FAIL_INT_STAT=0x00010000":

Message 5/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<0>bcmStrat[2.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009cca7
0 0x00a81428 0x00a86240 0x00a8b0d0 0x00a83d34 0x00447ce4 0x00449c08 0x01
134eb4 0xeeeeeeee )
Message 6/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00010000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/76 Syslog Message 07.31.04.0002 08/05/2012 07:49:46
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 8!

* "MEM_FAIL_INT_STAT=0x00040000":

Message 5/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<0>bcmStrat[2.tNimIntr]Chip 4 has encountered a fatal error. ( 0x00a8afc
4 0x00b45990 0x00b4a80c 0x00b4fc24 0x00b4829c 0x004529dc 0x00454c24 0x01
1fbff4 0xeeeeeeee )
Message 6/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00040000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/157 Syslog Message 07.71.01.0010 11/01/2012 01:17:48
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 4!

* "EP_INTR_STATUS=0x00000080":

Message 5/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<0>bcmStrat[1.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009cc9a
8 0x00a81224 0x00a8603c 0x00a8aecc 0x00a83b30 0x00447ce4 0x00449c08 0x01
134d14 0xeeeeeeee )
Message 6/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000080, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/201 Syslog Message 07.31.03.0010 07/18/2012 06:35:49
<3>bcmStrat[1.tNimIntr]MEM_FAIL interrupt occurred on chip 8!

* "IP0_INTR_STATUS=0x00000001":

Message 5/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<0>bcmStrat[4.tNimIntr]Chip 8 has encountered a fatal error. ( 0x00a8b68
8 0x00b4607c 0x00b4aef8 0x00b50310 0x00b48988 0x004529e8 0x00454c30 0x01
1fc6d4 0xeeeeeeee )
Message 6/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<3>bcmStrat[4.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000001, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/94 Syslog Message 07.71.02.0005 08/14/2012 00:22:03
<3>bcmStrat[4.tNimIntr]MEM_FAIL interrupt occurred on chip 8!

* "IP1_INTR_STATUS=0x00000010":

Message 15/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<0>bcmStrat[2.tNimIntr]Chip 4 has encountered a fatal error. ( 0x009cca7
0 0x00a81428 0x00a86240 0x00a8b0d0 0x00a83d34 0x00447ce4 0x00449c08 0x01
134eb4 0xeeeeeeee )
Message 16/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000010, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 17/120 Syslog Message 07.31.04.0002 09/19/2012 08:06:51
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 4!

* "IP1_INTR_STATUS=0x00000040":

Message 5/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<0>bcmStrat[2.tNimIntr]Chip 12 has encountered a fatal error. ( 0x009cc9
a8 0x00a81224 0x00a8603c 0x00a8aecc 0x00a83b30 0x00447ce4 0x00449c08 0x0
1134d14 0xeeeeeeee )
Message 6/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<3>bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000040, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000
Message 7/78 Syslog Message 07.31.03.0010 11/22/2011 12:27:32
<3>bcmStrat[2.tNimIntr]MEM_FAIL interrupt occurred on chip 12!

* "IP3_INTR_STATUS=0x00000002":

Message 7/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<0>bcmStrat[1.tNimIntr]Chip 8 has encountered a fatal error. ( 0x009deba
0 0x00a9d35c 0x00aa2174 0x00aa7004 0x00a9fc68 0x0044be38 0x0044de04 0x01
1541b4 0xeeeeeeee )
Message 8/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000002
Message 9/47 Syslog Message 07.41.02.0014 06/18/2012 16:08:40
<3>bcmStrat[1.tNimIntr]MEM_FAIL interrupt occurred on chip 8!

* All statuses zero:

Message 10/57 Syslog Message 07.41.02.0014 01/07/2012 17:32:34
<0>bcmStrat[1.tNimIntr]Chip 0 has encountered a fatal error. ( 0x009deba
0 0x00a9d35c 0x00aa2174 0x00aa7004 0x00a9fc68 0x0044be38 0x0044de04 0x01
1541b4 0xeeeeeeee )
Message 11/57 Syslog Message 07.41.02.0014 01/07/2012 17:32:34
<3>bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00
000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR
_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000

* Just "MEM_FAIL interrupt occurred":

Message 5/42 Syslog Message 07.11.01.0026 09/20/2012 04:41:36
<0>bcmStrat[2.tNimIntr]Fatal error: MEM_FAIL interrupt occurred on chip
0. ( 0x0090c368 0x009bb92c 0x0044100c 0x00441190 0x00f47254 0xeeeeeeee )

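
When comparing a log against the examples above, it can help to pull the status registers out of a wrapped syslog entry and list only the non-zero ones. The following Python sketch is one way to do that. It is not an Extreme Networks tool; the function names are hypothetical, and it assumes that 'show support' wraps each entry at a fixed column (so continuation lines can be rejoined by plain concatenation) and that values printed without a "0x" prefix are also hexadecimal.

```python
import re

# Matches pairs such as MEM_FAIL_INT_STAT=0x00000001 or IP0_INTR_STATUS=0000000000.
REGISTER_RE = re.compile(r"([A-Z][A-Z0-9_]*)=((?:0x)?[0-9A-Fa-f]+)")


def parse_status_registers(entry_lines):
    """Return {register_name: int_value} for one wrapped syslog entry.

    Assumption: the lines were wrapped at a fixed width, so joining them
    with no separator restores the original message text.
    """
    joined = "".join(line.rstrip("\r\n") for line in entry_lines)
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(joined)}


def nonzero_registers(registers):
    """Keep only the registers that report a non-zero status."""
    return {name: value for name, value in registers.items() if value != 0}


# One entry from the examples above, exactly as wrapped in the log.
entry = [
    "<3>bcmStrat[3.tNimIntr]MEM_FAIL_INT_STAT=0x00000001, IP0_INTR_STATUS=000",
    "0000000, IP1_INTR_STATUS=0x00200000, IP2_INTR_STATUS=0000000000, IP3_INT",
    "R_STATUS=0000000000, EP_INTR_STATUS=0000000000",
]
registers = parse_status_registers(entry)
flagged = nonzero_registers(registers)
for name, value in flagged.items():
    print(f"{name}=0x{value:08x}")
```

For this entry the sketch reports MEM_FAIL_INT_STAT and IP1_INTR_STATUS as the only non-zero registers, which corresponds to the "MEM_FAIL_INT_STAT=0x00000001"+"IP1_INTR_STATUS=0x00200000" example.
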
Solution
For "MEM_FAIL interrupt" events that exactly match the status values in article 14830, use that article instead.
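
As an illustration of that triage step, the sketch below checks a set of register values against the two combinations that the Symptoms list refers to article 14830. The signature table is copied from this article only; article 14830 itself may list others. The function name is hypothetical, and treating "exactly match" as "every register not named in the signature is zero" is an assumption.

```python
# Combinations that the Symptoms list above refers to article 14830.
# Assumption: any register not named in a signature must be zero.
ARTICLE_14830_SIGNATURES = [
    {"MEM_FAIL_INT_STAT": 0x00000001, "IP1_INTR_STATUS": 0x00000200},
    {"MEM_FAIL_INT_STAT": 0x00000200},
]


def matches_article_14830(registers):
    """Return True if the non-zero registers equal one known signature."""
    nonzero = {name: value for name, value in registers.items() if value != 0}
    return nonzero in ARTICLE_14830_SIGNATURES


# Register values as they might be read from a log entry.
covered_elsewhere = {
    "MEM_FAIL_INT_STAT": 0x00000200,
    "EP_INTR_STATUS": 0x00000000,
    "IP0_INTR_STATUS": 0x00000000,
}
covered_here = {
    "MEM_FAIL_INT_STAT": 0x00000001,
    "IP1_INTR_STATUS": 0x00200000,
}
print(matches_article_14830(covered_elsewhere))
print(matches_article_14830(covered_here))
```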

Otherwise, upgrade to firmware 8.21.02.0001 or higher.
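
To check whether a given system is already at or above that release, the version string shown in the syslog header (for example "07.21.02.0002") can be compared field by field. This is a minimal sketch with hypothetical function names; it assumes firmware versions are always four dot-separated numeric fields that order from left to right.

```python
FIXED_IN = "8.21.02.0001"  # release named in the Solution section


def parse_version(text):
    """Turn '07.21.02.0002' into (7, 21, 2, 2) so tuples compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(running, fixed_in=FIXED_IN):
    """Return True if the running firmware is older than the fixed release."""
    return parse_version(running) < parse_version(fixed_in)


# Versions taken from the log headers and release notes quoted in this article.
for version in ("07.21.02.0002", "8.11.04.0005", "08.21.02.0001"):
    status = "upgrade recommended" if needs_upgrade(version) else "at or above fix"
    print(f"{version}: {status}")
```

Converting each field to an integer avoids the pitfalls of string comparison, where the leading zero in "07" would otherwise affect ordering.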

Firmware 7.21.02.0002 release notes state, in the 'Problems Corrected in 7.21.02.0002' section:
If ports are receiving traffic while their MAC chip is being initialized the MAC may mishandle these packets and cause errors similar to "1. Fuji RX MAIN intr: Fuji=8, Adr=0, Reg=0x00200040", "2. Fuji MAC MAIN intr: Fuji=0, Adr=0, Reg=0x00080000", "3. MEM_FAIL interrupt occurred on chip" (which resets blade), or one of several bcmstrat memory system parity errors sometimes reported as "bcmStrat[3.bcmDPC]unit 0 (null) asserted ". The "parity" and "null" messages are displayed constantly and result in high CPU utilization.

Firmware 8.11.03.0005 release notes state, in the 'Platform Problems Corrected in 8.11.03.0005' section:
System logs the message "bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000010, IP2_INTR_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[5.tNimIntr]MEM_FAIL_INT_STAT=0x00000000, EP_INTR_STATUS=0x00000080, IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000, IP3_INTR_STATUS=0x00000000" and resets.

Firmware 8.11.04.0005 release notes state, in the 'Platform Problems Corrected in 8.11.04.0005' section:
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00200000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000001,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[2.tNimIntr]MEM_FAIL_INT_STAT=0x00040000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000000" and resets.
-and-
System logs the message "bcmStrat[1.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EP_INTR_STATUS=0x00000000, IP0_INTR_STATUS=0x00000000,
IP1_INTR_STATUS=0x00000000, IP2_INTR_STATUS=0x00000000,
IP3_INTR_STATUS=0x00000002" and resets.

Firmware 8.21.02.0001 release notes do not list this "all statuses zero" item, though it is corrected in this release:
System logs the message "bcmStrat[4.tNimIntr]MEM_FAIL_INT_STAT=0x00000000,
EGR_INTR0_STATUS=0x00000000, EGR_INTR1_STATUS=0x00000000,
IP0_INTR_STATUS=0x00000000, IP1_INTR_STATUS=0x00000000,
IP2_INTR_STATUS=0x00000000, IP2_INTR_STATUS_2=0x00000000,
IP3_INTR_STATUS=0x00000000, IP4_INTR_STATUS=0x00000000,
IP5_INTR_STATUS=0x00000000, IP5_INTR_STATUS_1=0x00000000,
IP5_INTR_STATUS_2=0x00000000" and resets.

If the issue persists after upgrade, please contact the GTAC for assistance.
