Network Architecture & Design


BD-12804: Slot (GM-20XTR) turns off with strange Erro:HAL.Card.Error

  • 1.  BD-12804: Slot (GM-20XTR) turns off with strange Erro:HAL.Card.Error

    Posted 06-29-2017 18:10
    Today, after 2 years of uptime, one of the slots on our BD-12804 was turned off for no apparent reason. We got some strange logs before the incident:
    06/29/2017 20:06:29.22


  • 2.  RE: BD-12804: Slot (GM-20XTR) turns off with strange Erro:HAL.Card.Error

    Posted 06-29-2017 18:28
    Looks like it could be a hardware issue, e.g. capacitors that failed due to overheating.


  • 3.  RE: BD-12804: Slot (GM-20XTR) turns off with strange Erro:HAL.Card.Error

    Posted 06-29-2017 19:45
    I did some searching and found a few instances of this that were resolved with software updates, but those were on versions 12.0 and 12.1, so 12.6 should be okay. I also see a later instance where an RMA was requested for the blade and no trouble was found at the repair facility. It's hard for me to say with certainty what caused this, but you'll definitely want to keep monitoring it. Keep in mind that you're dealing with 11+ year old equipment :)


  • 4.  RE: BD-12804: Slot (GM-20XTR) turns off with strange Erro:HAL.Card.Error

    Posted 06-29-2017 19:56
    If this happens again, and since you are going to reboot to clear it up anyway, you may want to run extended diagnostics on slot 1 and the MSM to see whether any issues show up. Be warned, though: if the diagnostics do find bad memory or other faulty hardware, they may take the bad card offline, so I would only do this if you have a spare. Also be sure to have a backup of the config if you run them on the MSM... We only had one 12k in our network, and if I recall correctly the diagnostics take about 5 or 6 minutes per card and you have to run them one card at a time.
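
    To size a maintenance window from the rough figures above, here is a minimal Python sketch. It assumes the recalled 5 to 6 minutes per card and strictly serial runs; the function name and the card labels are made up for illustration, and the estimate ignores reboot time.

```python
def diagnostics_window_minutes(cards, min_per_card=5, max_per_card=6):
    """Return a (low, high) estimate in minutes for running diagnostics
    on each card in turn. The per-card figures are the recalled
    5-6 minutes from the post, not measured values."""
    n = len(cards)
    # Cards are tested one at a time, so the per-card times simply add up.
    return n * min_per_card, n * max_per_card


cards = ["slot 1", "MSM"]  # hypothetical labels for the cards to test
low, high = diagnostics_window_minutes(cards)
print(f"{len(cards)} cards, one at a time: roughly {low}-{high} minutes")
```

    For slot 1 plus the MSM this comes out to roughly 10 to 12 minutes of diagnostics, not counting the reboot itself.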