Kern.Alert>CPU 1 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == ffffffffc057a1b0


Userlevel 3
Crit:Kern.Alert> CPU 1 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == ffffffffc057a1b0

We have an X480-48X switch. We have replaced the hardware via RMA, upgraded the code to the advised level (previous 16.1.3.6-patch1-11), and had a TAC case open. Even with all this, we are still seeing random reboots.

Full version :

xpctRebootDtect> Booting after System Failure.
03/15/2017 15:41:41.84 Changing to watchdog warm reset mode
03/15/2017 15:39:52.13 CPU 1: Kernel thread was stuck for 3.05 seconds, jiffies: 4435977008
03/15/2017 15:39:52.13 CPU 0: Kernel thread was stuck for 2.85 seconds, jiffies: 4435976997
03/15/2017 15:39:52.13 Fatal exception: panic in 5 seconds
03/15/2017 15:39:52.12 CPU 1 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == ffffffffc057a1b0

This unit is configured as a Supervlan and is internet facing.
Prior to this I did see in the log a number of outside attempts to access the switch management; these were stopped by our access policy.

8 replies

Userlevel 6
Hello Rod,

What switch and version?
Userlevel 3
Patrick Voss wrote:

Hello Rod,

What switch and version?

Sorry for the delay. The switch is an X480-48X.

16.1.3.6-patch1-11 (as advised by the original TAC case)
Userlevel 6
Sorry Rod,

I missed this information...my bad.

Let me look into it for you.
Userlevel 4
Rod, I think you should share the dmesg output with the TAC engineer to troubleshoot the issue further. It seems the issue you were facing in the old version is not resolved in the new version either.

Looking at the output above, it's difficult to understand why the code is referencing virtual address 0000000000000000 with epc == 0000000000000000.
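
As a rough illustration of how to read that alert line, here is a minimal Python sketch that pulls out the three addresses and flags the case where both the faulting address and epc are zero. The helper name `parse_oops` and the regex are my own assumptions for illustration, not an EXOS or TAC tool. The epc/ra wording matches the MIPS Linux fault message format, where a zero epc together with a zero faulting address usually means the CPU tried to fetch an instruction from address 0. That would be consistent with a call through a null function pointer, with ra pointing back at the caller, but only the full stack trace can confirm it.

```python
import re

# Hypothetical helper (not part of EXOS): parse the kernel alert line from this thread.
OOPS_RE = re.compile(
    r"CPU (?P<cpu>\d+) Unable to handle kernel paging request "
    r"at virtual address (?P<vaddr>[0-9a-fA-F]+), "
    r"epc == (?P<epc>[0-9a-fA-F]+), ra == (?P<ra>[0-9a-fA-F]+)"
)


def parse_oops(line):
    """Return cpu, vaddr, epc and ra as integers, or None if the line does not match."""
    m = OOPS_RE.search(line)
    if m is None:
        return None
    fields = {"cpu": int(m.group("cpu"))}
    for name in ("vaddr", "epc", "ra"):
        fields[name] = int(m.group(name), 16)
    # Assumption: epc == faulting address == 0 suggests an instruction fetch
    # from address 0, e.g. a jump through a null function pointer.
    fields["null_jump"] = fields["epc"] == 0 and fields["vaddr"] == 0
    return fields


line = (
    "03/15/2017 15:39:52.12 CPU 1 Unable to handle kernel paging request "
    "at virtual address 0000000000000000, epc == 0000000000000000, "
    "ra == ffffffffc057a1b0"
)
info = parse_oops(line)
print(info["null_jump"], hex(info["ra"]))
```

If `null_jump` is true, the ra value is the most useful address to hand to TAC, since it is the only one in the line that points at real code.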
Userlevel 4
Note to TAC:

TAC needs to check whether there is an invalid pointer in the SuperVlan code that is trying to access an invalid address. The error message also includes the stack, so take a look at it to identify where the error occurs.
Userlevel 3
We do have a basic misconfigured Supervlan (in different VRs). It's historical, and we are trying to address this. The switch was stable for years (well, 2), then we started to see these random reboots.
Userlevel 3
The switch configuration has now been modified so that all sub-VLANs are in the same VR as the super VLAN. We will continue to monitor and see if the switch reboots again without warning.
Could this also happen with EAPS configs?
