C5-Series f/w 6.42.12.0007 Reset from C5IntProc task after DMA Errors logged

Article ID: 14739 

Products
C5-Series; firmware 6.42.10.0016 through 6.61.12.0005, 6.71.01.0067 through 6.71.04.0004, 6.81.01.0027
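
As a quick way to check whether a given build falls inside the ranges listed above, the following is a minimal sketch. It assumes the dotted firmware fields compare numerically from left to right and that each range is inclusive at both ends. The names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical helpers for illustration, not part of any vendor tool.

```python
# Affected ranges copied from the Products line above (assumed inclusive).
AFFECTED_RANGES = [
    ("6.42.10.0016", "6.61.12.0005"),
    ("6.71.01.0067", "6.71.04.0004"),
    ("6.81.01.0027", "6.81.01.0027"),
]


def parse_version(text):
    """Turn '06.42.10.0016' or '6.42.10.0016' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(version):
    """Return True if the version lies inside any listed range."""
    v = parse_version(version)
    return any(
        parse_version(low) <= v <= parse_version(high)
        for low, high in AFFECTED_RANGES
    )
```

The "Build:06.42.10.0016" string from a "Start of Code" log line can be passed in directly once the "Build:" prefix is removed, since the leading zero is dropped by the integer conversion.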

Symptoms
DMA-type errors display in the current.log, followed by a unit reboot event. 

The current.log (5487) displays DMA-type errors (14007); for example:
<57> FEB 26 15:51:56 2012 STK1 BOOT[536866080]: bootos.c(714) 29 %% Start of Code - Build:06.42.10.0016 Date:Mon Dec 12 13:52:20 2011
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4433) 250 %% Unit 1 DMA regs:PCIMEM_START(0x04ed3040) SBUS_START(0x07a02000) ENTRY_CNT(0x00001000) CFG(0x0004011c) SBUS_ADDR(0x07a02000) CMIC_SCHAN_CTRL(0x00200000) CMIC_DMA_STAT(0x00082012) CMIC_IRQ_STAT(0x60000100) rv(0xfffffff5) LINE(2985)
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4434) 251 %% Unit 1 DMA regs:PCIMEM_START(0x04ed3040) SBUS_START(0x07a02000) ENTRY_CNT(0x00001000) CFG(0x0004011c) SBUS_ADDR(0x07a02000) CMIC_SCHAN_CTRL(0x00200000) CMIC_DMA_STAT(0x00082012) CMIC_IRQ_STAT(0x60000100) rv(0xfffffff5) LINE(2985)
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4454) 252 %% PCI Status for CPU=0x20a0
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4448) 254 %% PCI Status for Device 0x14e4:0xb626=0x02a0
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4461) 258 %% MPC85xx DMA/PCI register dump
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4480) 260 %% DGSR(0x00000000) ERR_DR(0x80000040) ERR_ATTRIB(0x001fa001) ERR_ADDR(0x00000000) ERR_EXT_ADDR(0x00000000) ERR_DL(0x00000000) ERR_DH(0x00000000)
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: hwutils.c(4496) 262 %% 1:PEX_ERR_DR(0x00000000) PEX_ERR_CAP_STAT(0x00000000) PEX_ERR_CAP_R0(0x00000000) PEX_ERR_CAP_R1(0x00000000) PEX_ERR_CAP_R2(0x00000000) PEX_ERR_CAP_R3(0x00000000)
<160>Jul 15 06:13:24 10.3.0.202-1 SIM[82019840]: broad_hpc_drv.c(2689) 268 %% _soc_xgs3_mem_dma: L2_ENTRY.ipipe0 failed(NAK), unit 1
<160>Jul 15 06:13:26 10.3.0.202-1 SIM[82019840]: broad_hpc_drv.c(2689) 345 %% soc_l2x_thread: Too many errors
<160>Jul 15 06:13:26 10.3.0.202-1 DRIVER[82019840]: hwutils.c(4100) 346 %% soc_l2x_thread unit = 1: DMA failed too many times
<160>Jul 15 06:13:26 10.3.0.202-1 SIM[82019840]: hwutils.c(4101) 347 %% soc_l2x_thread unit = 1: DMA failed too many times
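
To look for this signature in a saved copy of the current.log, a small scan such as the one below may help. It is only a sketch: the three patterns are taken from the example messages above, and the assumption is that other occurrences use the same wording. `DMA_PATTERNS` and `find_dma_symptoms` are hypothetical names.

```python
import re

# Patterns derived from the example log lines above (wording is assumed stable).
DMA_PATTERNS = {
    "register_dump": re.compile(r"Unit \d+ DMA regs:"),
    "mem_dma_nak": re.compile(r"_soc_xgs3_mem_dma: .* failed\(NAK\)"),
    "too_many_failures": re.compile(r"DMA failed too many times"),
}


def find_dma_symptoms(log_lines):
    """Map each pattern name to the 1-based line numbers where it matches."""
    hits = {name: [] for name in DMA_PATTERNS}
    for number, line in enumerate(log_lines, start=1):
        for name, pattern in DMA_PATTERNS.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits
```

Calling `find_dma_symptoms(open("current.log").read().splitlines())` on an exported log (the file name is an assumption) returns the matching line numbers per pattern, which can then be compared against the reboot messages described below.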
With firmware 6.51.02.0018 and lower... 
The current.log (5487) then displays a "l7_prepareSystemDump" BackTrace with a "hwutils.c" reset cause; for example:
    <57> JUL 15 06:14:39 2012 STK1 BOOT[536866080]: bootos.c(714) 183 %%
    Start of Code - Build:06.42.10.0016 Date:Mon Dec 12 13:52:20 2011
    BackTrace-0x0082cc28: sysReboot (0x82cc28) + 0x0
    BackTrace-0x010e848c: SwitchReset (0x10e83f4) + 0x98
    BackTrace-0x00c2373c: l7_prepareSystemDump (0xc22c44) + 0xaf8
    BackTrace-0x00c23ce0: l7MonitorTask (0xc23c3c) + 0xa4
    BackTrace-0x011fd960: vxTaskEntry (0x11fd904) + 0x5c
    BackTrace-r0 = 0x00000000 r1 = 0x00000000 r2 = 0x00000000
    BackTrace-r3 = 0x00000000 r4 = 0x00000000 r5 = 0x00000000
    BackTrace-r6 = 0x00000000 r7 = 0x00000000 r8 = 0x00000000
    BackTrace-r9 = 0x00000000 r10 = 0x00000000 r11 = 0x00000000
    BackTrace-r12 = 0x00000000 r13 = 0x00000000 r14 = 0x00000000
    BackTrace-r15 = 0x00000000 r16 = 0x00000000 r17 = 0x00000000
    BackTrace-r18 = 0x00000000 r19 = 0x00000000 r20 = 0x00000000
    BackTrace-r21 = 0x00000000 r22 = 0x00000000 r23 = 0x00000000
    BackTrace-r24 = 0x00000000 r25 = 0x00000000 r26 = 0x00000000
    BackTrace-r27 = 0x00000000 r28 = 0x00000000 r29 = 0x00000000
    BackTrace-r30 = 0x00000000 r31 = 0x00000000
    BackTrace-lr = 0x00000000 pc = 0x00000000 msr = 0x00000000
    <110> JUL 15 06:14:58 2012 STK1 BOOT[211235248]: edb_bxs.c(1226) 202 %%
    Last switch reset caused by hwutils.c(3046): Error code 0x6c327800, after 12071860 second
With firmware 6.61.02.0007 and higher...
The current.log (5487) then displays a "Task C5IntProc(<address>) is suspended..." message; for example:
      Task C5IntProc(0xa8ab2d0) is suspended with error 2, creating file sysDmp3Jun0812.z
      <57> JUN 08 04:17:17 2012 STK1 BOOT[536866080]: bootos.c(722) 147 %%
      Start of Code - Build:06.61.02.0007 Date:Wed Mar 28 19:45:16 2012
      <110> JUN 08 04:18:23 2012 STK1 BOOT[217163184]: edb_bxs.c(1222) 149 %%
      Last switch reset caused by hwutils.c(3110): Error code 0x6c327800, after 3435579 second
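
The "Last switch reset caused by" line in both firmware examples can be picked apart with a regular expression, sketched below. Treating the trailing number as seconds of uptime before the reset is an assumption based on the wording of the message; `RESET_RE` and `parse_reset_cause` are hypothetical names.

```python
import re

# Matches e.g. "Last switch reset caused by hwutils.c(3110): Error code
# 0x6c327800, after 3435579 second" as shown in the examples above.
RESET_RE = re.compile(
    r"Last switch reset caused by (?P<file>[\w.]+)\((?P<line>\d+)\): "
    r"Error code (?P<code>0x[0-9a-fA-F]+), after (?P<seconds>\d+) second"
)


def parse_reset_cause(text):
    """Return the reset-cause fields as a dict, or None if no match."""
    match = RESET_RE.search(text)
    if match is None:
        return None
    seconds = int(match.group("seconds"))
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "code": match.group("code"),
        "seconds": seconds,
        "days": seconds / 86400,  # assumed to be uptime in seconds
    }
```

Under that assumption, the 12071860 seconds in the first example works out to roughly 139.7 days, which is broadly consistent with the FEB 26 and JUL 15 "Start of Code" timestamps shown earlier.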
The sysDmp (13650) file states "Task Name: C5IntProc" and the diagnostic points to either "IntProcessTask" or "_vx_offset_COPROC_DESC_next"; for example:

      Example 1
      Detailed erred task information
      ---------------------------------
      Calling Stack:

      ------------
      Task ID: 0xa4563b0
      Task Name: C5IntProc
      PC: 0x12b4ce8
      PendQ: 0xa445d0c
      SP: 0xa455cc0

      0x12b4ce8: taskSuspend (0x12b48ac) + 0x43c
      0x10de194: log_error_nvram (0x10dded8) + 0x2bc
      0x10e3170: IntProcessTask (0x10e2654) + 0xb1c
      0x11fd960: vxTaskEntry (0x11fd904) + 0x5c

      Example 2
      Detailed exception task information
      ---------------------------------

      Calling Stack:
      --------------
      Task ID: 0xa90dc20
      Task Name: C5IntProc
      PC: 0x0
      PendQ: 0x13d87a44
      SP: 0x0

      0x0: _vx_offset_COPROC_DESC_next (0x0) + 0x0
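
A check for the two sysDmp signatures above could be sketched as follows. This assumes the sysDmp file has already been extracted to plain text in the layout shown in Example 1 and Example 2, and that only the erred task's section is passed in. `matches_signature` and the two regular expressions are hypothetical helpers.

```python
import re

# "Task Name: C5IntProc" line, as shown in both examples.
TASK_RE = re.compile(r"^\s*Task Name: (?P<name>\S+)")
# Stack frame line, e.g. "0x10e3170: IntProcessTask (0x10e2654) + 0xb1c".
FRAME_RE = re.compile(
    r"^\s*0x[0-9a-fA-F]+: (?P<symbol>\w+) \(0x[0-9a-fA-F]+\) \+ 0x[0-9a-fA-F]+"
)
SIGNATURE_SYMBOLS = {"IntProcessTask", "_vx_offset_COPROC_DESC_next"}


def matches_signature(dump_text):
    """True if the task is C5IntProc and a signature symbol is on the stack."""
    task = None
    symbols = []
    for line in dump_text.splitlines():
        task_match = TASK_RE.match(line)
        if task_match:
            task = task_match.group("name")
            continue
        frame_match = FRAME_RE.match(line)
        if frame_match:
            symbols.append(frame_match.group("symbol"))
    return task == "C5IntProc" and any(s in SIGNATURE_SYMBOLS for s in symbols)
```

A positive result only indicates that the dump resembles the examples in this article; it does not by itself confirm the root cause.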

Solution
Please see article 16165 for firmware upgrade recommendations.