B5-Series f/w 6.42.12.0007 Reset from IntProc task after DMA Errors logged

  • Updated 5 years ago
Article ID: 14793 

Products
B5-Series; firmware 6.42.10.0016 through 6.61.12.0005, 6.71.01.0067 through 6.71.04.0004, 6.81.01.0027
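
To check whether a given build falls inside the ranges listed above, the comparison can be sketched as follows. This is a minimal sketch, assuming that comparing the four dot-separated fields numerically matches the vendor's release ordering; the function names are hypothetical and not part of any vendor tool.

```python
# Affected ranges copied from the Products line of this article (inclusive).
AFFECTED_RANGES = [
    ("6.42.10.0016", "6.61.12.0005"),
    ("6.71.01.0067", "6.71.04.0004"),
    ("6.81.01.0027", "6.81.01.0027"),
]


def parse_version(text):
    """Turn '06.42.12.0007' or '6.42.12.0007' into (6, 42, 12, 7)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(version):
    """Return True if the version falls inside any listed range.

    Assumption: field-by-field numeric order equals release order.
    """
    v = parse_version(version)
    return any(parse_version(lo) <= v <= parse_version(hi)
               for lo, hi in AFFECTED_RANGES)


if __name__ == "__main__":
    # Build strings as they appear in the "Start of Code" log lines.
    for candidate in ("06.42.12.0007", "6.61.05.0007", "6.71.05.0001"):
        print(candidate, is_affected(candidate))
```

The leading zero in "Build:06.42.12.0007" is dropped by the integer conversion, so build strings copied straight from a "Start of Code" line can be passed in unchanged.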

Symptoms
DMA-type errors appear in the current.log, followed by a unit reboot.

The current.log (5487) shows DMA-type errors (14007); for example:
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3910) 239 %% Unit 0 DMA regs:PCIMEM_START(0x60000100) SBUS_START(0x04728828) ENTRY_CNT(0x06702000) CFG(0x00001000) SBUS_ADDR(0x0003000c) CMIC_DMA_STAT(0x00000000) CMIC_IRQ_STAT(0x00082012) rv(0x06702000) LINE(-11)
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3911) 240 %% Unit 0 DMA regs:PCIMEM_START(0x60000100) SBUS_START(0x04728828) ENTRY_CNT(0x06702000) CFG(0x00001000) SBUS_ADDR(0x0003000c) CMIC_DMA_STAT(0x00000000) CMIC_IRQ_STAT(0x00082012) rv(0x06702000) LINE(-11)
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3926) 241 %% MPC83xx DMA/PCI regs: DMAGSR(0x00000000) PCI_ESR(0x80000040) PCI_EATCR(0x3f10a001) PCI_EACR(0x00000000) PCI_EEACR(0x00000000) PCI_EDCR(0x00000000)
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3940) 242 %% PCI Status for Device 0x14e4:0xb514=0x02a0
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3941) 243 %% PCI Status for Device 0x14e4:0xb514=0x02a0
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3940) 244 %% PCI Status for Device 0x14e4:0xb514=0x02a0
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3941) 245 %% PCI Status for Device 0x14e4:0xb514=0x02a0
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: broad_hpc_drv.c(2689) 246 %% _soc_xgs3_mem_dma: L2_ENTRY.ipipe0 failed(NAK), unit 0
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: broad_hpc_drv.c(2689) 247 %% soc_l2x_thread: Too many errors
<160>Aug 24 02:20:45 10.1.1.2-2 DRIVER[74138864]: hwutils.c(3705) 248 %% soc_l2x_thread unit = 0: DMA failed too many times
<160>Aug 24 02:20:45 10.1.1.2-2 SIM[74138864]: hwutils.c(3706) 249 %% soc_l2x_thread unit = 0: DMA failed too many times
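
When reviewing a saved copy of the current.log, the lines above can be picked out with a short script. The following is a rough sketch only: the marker strings and the line layout are taken from the samples in this article, while the function name and regular expression are assumptions made for illustration and may not cover every log variant.

```python
import re

# Substrings taken from the sample log lines in this article.
DMA_MARKERS = (
    "_soc_xgs3_mem_dma",
    "soc_l2x_thread: Too many errors",
    "DMA failed too many times",
)

# Layout as seen in the samples:
# <pri>Mon DD HH:MM:SS host COMPONENT[task]: file.c(line) seq %% message
LINE_RE = re.compile(
    r"^<\d+>\s*(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<component>\w+)\[(?P<task>\d+)\]:\s+"
    r"(?P<file>[\w.]+)\((?P<line>\d+)\)\s+(?P<seq>\d+)\s+%%\s*(?P<msg>.*)$"
)


def find_dma_errors(lines):
    """Yield (stamp, file, line, msg) for lines carrying a DMA marker."""
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m and any(marker in m.group("msg") for marker in DMA_MARKERS):
            yield (m.group("stamp"), m.group("file"),
                   int(m.group("line")), m.group("msg"))
```

Calling `list(find_dma_errors(open("current.log")))` on a saved log (the file name here is only an example) returns the matching entries in order; the register-dump and "PCI Status" lines are skipped because they carry none of the markers.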
With firmware 6.51.02.0018 and lower:
The current.log (5487) then shows an "l7_prepareSystemDump" BackTrace with a "hwutils.c" reset cause; for example:
    <160>Aug 24 02:20:55 10.1.1.2-2 SIM[143763376]: hwutils.c(2460) 250 %% Error(0x6c327800)
    <160>Aug 24 02:20:56 10.1.1.2-2 SIM[142455312]: hwutils.c(4099) 251 %% Code exception: 1 minute before watchdog will no longer be petted.
    <57> AUG 24 02:21:41 2012 STK1 BOOT[536870176]: bootos.c(567) 195 %% Start of Code - Build:06.42.12.0007 Date:Thu Apr 26 16:11:16 2012
    BackTrace-0x00770f10: sysReboot (0x770f10) + 0x0
    BackTrace-0x01020630: SwitchReset (0x1020558) + 0xd8
    BackTrace-0x00b62740: l7_prepareSystemDump (0xb61bf0) + 0xb50
    BackTrace-0x00b62d20: l7MonitorTask (0xb62c48) + 0xd8
    BackTrace-0x01137f30: vxTaskEntry (0x1137ed4) + 0x5c
    BackTrace-r0 = 0x00000000 r1 = 0x00000000 r2 = 0x00000000 r3 = 0x00000000
    BackTrace-r4 = 0x00000000 r5 = 0x00000000 r6 = 0x00000000 r7 = 0x00000000
    BackTrace-r8 = 0x00000000 r9 = 0x00000000 r10 = 0x00000000 r11 = 0x00000000
    BackTrace-r12 = 0x00000000 r13 = 0x00000000 r14 = 0x00000000 r15 = 0x00000000
    BackTrace-r16 = 0x00000000 r17 = 0x00000000 r18 = 0x00000000 r19 = 0x00000000
    BackTrace-r20 = 0x00000000 r21 = 0x00000000 r22 = 0x00000000 r23 = 0x00000000
    BackTrace-r24 = 0x00000000 r25 = 0x00000000 r26 = 0x00000000 r27 = 0x00000000
    BackTrace-r28 = 0x00000000 r29 = 0x00000000 r30 = 0x00000000 r31 = 0x00000000
    BackTrace-lr = 0x00000000 pc = 0x00000000 msr = 0x00000000
    <110> AUG 24 02:22:01 2012 STK1 BOOT[174300888]: edb_bxs.c(1226) 210 %% Last switch reset caused by hwutils.c(2460): Error code 0x6c327800, after 3322081 seconds
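
The "Last switch reset caused by" message carries the source file, line, error code, and uptime at the time of the reset. A minimal sketch of pulling those fields apart is shown here; the pattern is derived only from the sample message in this article, and the function name is hypothetical.

```python
import re

# Pattern built from the sample "Last switch reset" message in this article.
RESET_RE = re.compile(
    r"Last switch reset caused by (?P<file>[\w.]+)\((?P<line>\d+)\): "
    r"Error code (?P<code>0x[0-9a-fA-F]+), after (?P<uptime>\d+) seconds"
)


def parse_reset_cause(text):
    """Return a dict describing the reset cause, or None if absent."""
    m = RESET_RE.search(text)
    if not m:
        return None
    uptime = int(m.group("uptime"))
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "code": int(m.group("code"), 16),
        "uptime_seconds": uptime,
        "uptime_days": uptime // 86400,  # whole days only
    }
```

For the sample message this reports hwutils.c, line 2460, and an uptime of 38 whole days. Read as big-endian bytes, 0x6c327800 is the ASCII text "l2x" followed by a NUL, which lines up with the soc_l2x_thread messages logged earlier; treat that as an observation rather than documented behaviour.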
With firmware 6.61.02.0007 and higher:
The current.log (5487) then shows a "Task IntProc(<address>) is suspended..." message; for example:
      Task IntProc(0x80226a0) is suspended with error 2, creating file sysDmp2Aug2412.z
      <57> AUG 24 02:20:59 2012 STK1 BOOT[536870176]: bootos.c(571) 363 %% Start of Code - Build:06.61.05.0007 Date:Fri May 25 22:59:22 2012
      <110> AUG 24 02:22:01 2012 STK1 BOOT[165052800]: edb_bxs_api.c(778) 365 %% Last switch reset caused by hwutils.c(2493): Error code 0x6c327800, after 76 seconds
The sysDmp (13650) file states "Task Name: IntProc", and the diagnostic points to either "IntProcessTask" or "_vx_offset_COPROC_DESC_next"; for example:

Example 1
      Detailed exception task information
      ---------------------------------

      Calling Stack:
      --------------
      Task ID: 0x891a7b0
      Task Name: IntProc
      PC: 0x11e3b0c
      PendQ: 0x890a114
      SP: 0x891a0b0

      0x11e3b0c: taskSuspend (0x11e36d0) + 0x43c
      0x1018ffc: log_error_nvram (0x1018cfc) + 0x300
      0x101c240: IntProcessTask (0x101b804) + 0xa3c
      0x1137f30: vxTaskEntry (0x1137ed4) + 0x5c

Example 2
      Detailed exception task information
      ---------------------------------

      Calling Stack:
      --------------
      Task ID: 0x8001580
      Task Name: IntProc
      PC: 0x0
      PendQ: 0x7ff0ee4
      SP: 0x0

      0x0: _vx_offset_COPROC_DESC_next (0x0) + 0x0
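
To compare a sysDmp calling stack against the two examples above, the task name and frame symbols can be extracted as sketched below. This is an illustrative sketch under the assumption that the dump text follows the layout shown in this article; the function name and patterns are hypothetical.

```python
import re

# Frame layout as shown in the examples: "0xADDR: symbol (0xBASE) + 0xOFF"
FRAME_RE = re.compile(
    r"(?P<addr>0x[0-9a-fA-F]+):\s+(?P<symbol>\w+)\s+"
    r"\((?P<base>0x[0-9a-fA-F]+)\)\s+\+\s+(?P<offset>0x[0-9a-fA-F]+)"
)
TASK_RE = re.compile(r"Task Name:\s*(?P<name>\S+)")

# The two diagnostic symbols this article names for the IntProc task.
KNOWN_SYMBOLS = {"IntProcessTask", "_vx_offset_COPROC_DESC_next"}


def summarize_dump(text):
    """Return (task_name, frame_symbols, matches_article)."""
    task = TASK_RE.search(text)
    name = task.group("name") if task else None
    symbols = [m.group("symbol") for m in FRAME_RE.finditer(text)]
    matches = name == "IntProc" and bool(KNOWN_SYMBOLS & set(symbols))
    return name, symbols, matches
```

A dump that names a different task, or whose stack contains neither symbol, returns `False` in the third position and most likely describes a different issue than the one covered here.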

Solution
Please see 16165 for firmware upgrade recommendations.
      FAQ User, Official Rep

      Posted 5 years ago
