ExtremeSwitching (EXOS)

Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

  • 1.  Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-25-2018 11:06
    The BD 8810 rebooted on its own and lost about 30% of its previously learned MAC addresses, and it is now not learning any new MAC addresses.

    # sh fdb stats
    Total: 253 Static: 0 Perm: 0 Dyn: 253 Dropped: 0
    FDB Aging time: 300
    FDB VPLS Aging time: 300

    # sh log
    02/23/2018 16:47:04.07


  • 2.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-25-2018 20:10
    This box is running a very old version of EXOS. I suggest that you upgrade to the latest recommended version.
    https://extremeportal.force.com/ExtrArticleDetail?n=000002378&q=recommended%20version

    This will most likely solve your issue.



  • 3.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-26-2018 17:29
    I have a question: did you load a config from the CLI after the reboot? The output below shows the switch running on the factory default config even though primary.cfg is selected and has been saved, so you may not have all of the config loaded, and some earlier changes may never have been saved. The MAC table only holds what the switch sees on ports that belong to VLANs or VMANs, so if clients are connected to ports with no VLAN provisioned, the table will be smaller. You can also look at the system dump to see what may have caused the reboot.

    Config Selected: primary.cfg primary.cfg
    Config Booted: Factory Default Factory Default
    primary.cfg Created by ExtremeXOS version 12.3.3.6
    946439 bytes saved on Thu Feb 22 09:27:21 2018
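
    The mismatch in the output above can be checked mechanically. The following is a minimal Python sketch, not an Extreme tool: the helper names are hypothetical, the sample text is the excerpt quoted in this post, and it assumes that a switch booted from its saved config would show the same file name on both lines.

```python
import re

# Excerpt of "show switch" output quoted in this thread (two MSM columns).
SAMPLE = """\
Config Selected: primary.cfg primary.cfg
Config Booted: Factory Default Factory Default
primary.cfg Created by ExtremeXOS version 12.3.3.6
946439 bytes saved on Thu Feb 22 09:27:21 2018
"""

# Only lines that start with "Config <word>:" are treated as fields.
FIELD_RE = re.compile(r"^\s*(Config \w+):\s*(.*)$")


def parse_config_fields(text):
    """Return a dict such as {'Config Selected': '...', 'Config Booted': '...'}."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def booted_matches_selected(fields):
    """True if the selected config file name also appears on the booted line."""
    selected = fields["Config Selected"].split()[0]
    return selected in fields["Config Booted"]


if __name__ == "__main__":
    fields = parse_config_fields(SAMPLE)
    if not booted_matches_selected(fields):
        print("Warning: booted config differs from selected config:", fields)
```

    For the excerpt above the check fails, which is the condition this post is pointing at: primary.cfg is selected, but the switch came up on Factory Default.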

    He did not say the reboot was due to old software, but it is a good idea to keep things current. If this was a very static configuration that ran for years without issues, you still need to find out what caused the reboot, unless a specific bug caused it and is fixed in newer code.

    The command to view the system dump is:

    show debug system-dump



  • 4.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-27-2018 07:04
    The crash dump points to memory depletion.
    If the switch has been running for a very long time, you may have hit CR# xos0042592. With this defect, the Node Manager process consumes excessive CPU once the system uptime reaches 994 days, and it ultimately crashes due to memory depletion. This is fixed in EXOS 12.5.3 and later.

    As Etherman correctly described, the switch is running with the factory default config, and that may be the reason why you see decreased performance. Actually, I am surprised that anything is working at all.

    Again, I suggest upgrading to a current version to avoid the possible memory depletion, and make sure you use the correct config: "use config primary"
    With the current information we cannot determine why the switch chose to use the factory default config. If you want that investigated, it is best to open a case through the support portal.
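
    To check whether a given release already contains the fix mentioned above, the dotted version strings can be compared numerically. This is a minimal Python sketch with hypothetical helper names. It assumes that "12.5.3 and up" means any release whose dotted version compares greater than or equal to 12.5.3, and it uses 12.3.3.6 only because that is the version recorded in primary.cfg earlier in this thread; the version actually running should be confirmed on the switch.

```python
def parse_version(version):
    """Turn a dotted version string such as '12.3.3.6' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def has_fix(version, fixed_in="12.5.3"):
    """True if `version` is at or above the release that carries the fix.

    Tuples compare element by element, so (12, 3, 3, 6) < (12, 5, 3).
    """
    return parse_version(version) >= parse_version(fixed_in)


if __name__ == "__main__":
    for candidate in ("12.3.3.6", "12.5.3", "12.5.4.5"):
        state = "contains" if has_fix(candidate) else "predates"
        print(candidate, state, "the 12.5.3 fix")
```

    Under these assumptions, 12.3.3.6 predates the fix, which is consistent with the upgrade advice in this post.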



  • 5.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 03-08-2018 10:16
    Hi Ron, thank you very much. Your reply went a long way in helping me understand what was going on.

    I really appreciate it.


  • 6.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-25-2018 20:10
    Thank you very much for the swift response. I would just like to know what you want me to upgrade: the Extreme OS, the BootROM, or both?

    I would also like to understand this behavior of the switch. Does it mean that when the OS gets too old it stops learning MAC addresses?

    Thank you again, you have already been of great help.


  • 7.  RE: Bd 8810 Restarted and Lost 30% of Mac Addresses and now Not Learning New Mac Addresses

    Posted 02-26-2018 17:29

    Hi EtherMan, I really appreciate your prompt feedback, as this is a very urgent issue. I have copied the system dump and pasted it here. I can see that the reason for the failure is a process crash and that the process is nodemgr. Would you know any reason why this process would cause a reboot? I am guessing the switch went to factory default, judging from the output of "show switch".

    And no, I didn't load any config from the CLI after the reboot. I still do not know what caused the reboot, because, as you said, the device has been running for years without issues.

    # show debug system-dump
    ===============================================
    MSM-A system dump information
    ===============================================
    core_dump_info storage: 8/3072 used [empty]
    failure: process crash
    time: Wed Feb 21 00:20:01 2018
    process nodemgr
    pid 619
    signal 6
    $0 : z0=00000000 at=fefefeff v0=00000000 v1=00000001
    $4 : a0=0000026b a1=00000006 a2=00000001 a3=00000000
    $8 : t0=00000002 t1=2b500028 t2=2b500450 t3=2b500028
    $12: t4=00000001 t5=000001d9 t6=000007e2 t7=2abc8414
    $16: s0=0000026b s1=00000402 s2=00000006 s3=2abdb344
    $20: s4=100007b4 s5=00412cf0 s6=100238e4 s7=100238e0
    $24: t8=00000113 t9=2b22af80
    $28: gp=2b3a9b40 sp=7f7ff890 s8=00000193 ra=2ab4b1f4
    Hi : 00000285
    Lo : 00033829
    epc : 2b22af94 Not tainted
    Status: 00001f13
    Cause : 00808020
    7f7ff890: 00000000 2aba07d0 2b27e704 2b500010 2aba07d0 2b39d1d0 00000006 2abdb180
    7f7ff8b0: 00000163 2aba07d0 2ab4bae4 2ab4bac8 00000000 00000000 2abdb2c4 2abdb180
    7f7ff8d0: 2aba07d0 2abdb344 2abdb2c4 2aba07d0 2b22cec8 2b22cf3c 2aba07d0 2ab4f10c
    7f7ff8f0: 7f7ffa20 7f7ffa48 2b3a9b40 00412cf0 2aba07d0 000000a5 00000000 000000a5
    7f7ff910: 2b500638 000000a5 000000a5 2aba07d0 2b27c48c 00000000 ffffffff 000000a5
    7f7ff930: 2b3a9b40 2b27dae8 2b3a9b40 2b500508 2b39f928 000000a5 2b500638 000000a5
    7f7ff950: 2b3a9b40 2b27b4b8 2b27bb64 00000000 2aba07d0 00000000 2aba07d0 2b39f850
    7f7ff970: 7f7ffc00 2b39f850 00000001 2abdb344 2aba07d0 2aba07d0 2aba07d0 2ab48390
    7f7ff990: 00000020 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7f7ff9b0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7f7ff9d0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7f7ff9f0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7f7ffa10: 2b3a9b40 2b223040 2ab21140 2b34d6c0 7fff7aef 2b34d6e8 2abdb180 00000163
    7f7ffa30: 2abdb2c4 2b34d6e8 2abdb344 2aace46c 2b3a9b40 2abdb2c4 2b500638 2abdb2c4
    7f7ffa50: 2abdb2c4 10023f70 2abdb180 2abdb344 2b3a9b40 2abc8414 00000000 00000000
    7f7ffa70: 00000000 00000000 2abdb344 2abdb180 00000163 2abdb2c4 2abdb344 0034f2b0
    log: ... transaction through muxes was interrupted, clean up
    log: <7>Opening app watchdog timer, instance: 1
    log: <7>Application watchdog timer is not cleanup, instance: 1
    log: <7>Opening app watchdog timer, instance: 1
    log: <7>Application watchdog timer is not cleanup, instance: 1

    Text segment map
    0x00400000-0x00414000 /exos/bin/nodemgr
    0x2aac0000-0x2aada000 /lib/ld-2.2.5.so
    0x2ab40000-0x2ab52000 /lib/libpthread-0.9.so
    0x2abc0000-0x2abdd000 /exos/lib/libcommon.so
    0x2ac40000-0x2ac71000 /exos/lib/libipml.so
    0x2acc0000-0x2acd2000 /exos/lib/libepm.so
    0x2ad40000-0x2ad4f000 /exos/lib/libds.so
    0x2adc0000-0x2ae1c000 /exos/lib/libdm.so
    0x2ae80000-0x2ae82000 /exos/lib/libnm.so
    0x2af00000-0x2af06000 /exos/lib/libcli.so
    0x2af80000-0x2afa4000 /exos/lib/libexpat.so
    0x2b000000-0x2b018000 /exos/lib/libcmbackend.so
    0x2b080000-0x2b08c000 /exos/lib/libems.so
    0x2b100000-0x2b117000 /exos/lib/libdispatch.so
    0x2b180000-0x2b18c000 /exos/lib/libwkninfo.so
    0x2b200000-0x2b35d000 /lib/libc-2.2.5.so
    0x2b3c0000-0x2b3c3000 /lib/libdl-2.2.5.so
    0x2b440000-0x2b441000 /exos/lib/libusertrace.so
    failure: process crash
    time: Wed Feb 21 00:20:02 2018
    process nodemgr
    pid 403
    signal 6
    $0 : z0=00000000 at=10001f00 v0=00000004 v1=00000001
    $4 : a0=00001091 a1=00000009 a2=7fff7650 a3=00000001
    $8 : t0=00001f00 t1=00000000 t2=00000000 t3=8032a060
    $12: t4=886c8480 t5=886c8500 t6=886c8400 t7=00000058
    $16: s0=7fff77c0 s1=1000ec50 s2=7fff7880 s3=7fff7758
    $20: s4=00000009 s5=7fff7880 s6=10011660 s7=00000000
    $24: t8=00000000 t9=2b2f4130
    $28: gp=2b3a9b40 sp=7fff7600 s8=2b156e40 ra=2ac6b7d0
    Hi : 0000007f
    Lo : ef9db29d
    epc : 2b2f4144 Not tainted
    Status: 00001f13
    Cause : 00808020
    7fff7600: 2add0c6c 2add0be4 2b3a9b40 2abc6b78 0000012c 10014ce8 2acb8800 00000400
    7fff7620: 2acb8800 2ac650d0 10032002 2b114b1c 2b115144 2b114cc0 100115c0 2b114b1c
    7fff7640: 2aba07d0 2b115144 2acb8800 2ac24830 2aba07d0 2b103504 2ad194c0 2ae647d0
    7fff7660: 2aba07d0 10024e0c 00000000 2aba07d0 7fff77c0 1000ec50 7fff7880 7fff7748
    7fff7680: 10011660 00000000 2b156e80 10011660 2acb8800 2ac695cc 000000c1 000001a0
    7fff76a0: 2b15eac0 00000000 7fff77c0 7fff7758 1000c270 2b156ec0 2acb8800 00000002
    7fff76c0: 2aba07d0 10011660 1000bb90 100115c0 2b114b1c 2b114ca4 2b114cc0 10032002
    7fff76e0: 2b156e50 00000000 2aba07d0 000001bd 2ab48464 2ac24830 2b104dfc 2b104820
    7fff7700: 2aba07d0 0000002c 2aba07d0 2b156e80 2b156e50 2aba07d0 2b10470c 2b1045b4
    7fff7720: 2b156e80 10011660 2b15eac0 2b156e40 00000000 00000af0 00000000 2b156e40
    7fff7740: 0bf40218 0039ada0 0bf40218 0039ada0 00000000 00000000 00000038 2b156e40
    7fff7760: 00000001 2b156e80 00000038 2b156e40 000f423f 2b156e40 2acb8800 2b10576c
    7fff7780: 2b2826ac 00000009 2aba07d0 2add0be4 7fff77c0 2b103504 2ae5dae0 10011450
    7fff77a0: 7fff7878 10011454 2b15eac0 2b15eac0 0bf40218 0039ada0 0bf40218 000ef420
    7fff77c0: 00000000 000493e0 65706d48 656c6c6f 54696d65 72000000 00000000 00000000
    7fff77e0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    log: ...
    log: <6> 0x2b200000-0x2b35d000 /lib/libc-2.2.5.so
    log: <6> 0x2b3c0000-0x2b3c3000 /lib/libdl-2.2.5.so
    log: <6> 0x2b440000-0x2b441000 /exos/lib/libusertrace.so.0.0
    log: <6>*****
    log: <6>Start core dump: pid 619 (nodemgr) signal 6
    log: <6>End core dump: pid 619 (nodemgr) signal 6

    Text segment map
    0x00400000-0x00414000 /exos/bin/nodemgr
    0x2aac0000-0x2aada000 /lib/ld-2.2.5.so
    0x2ab40000-0x2ab52000 /lib/libpthread-0.9.so
    0x2abc0000-0x2abdd000 /exos/lib/libcommon.so
    0x2ac40000-0x2ac71000 /exos/lib/libipml.so
    0x2acc0000-0x2acd2000 /exos/lib/libepm.so
    0x2ad40000-0x2ad4f000 /exos/lib/libds.so
    0x2adc0000-0x2ae1c000 /exos/lib/libdm.so
    0x2ae80000-0x2ae82000 /exos/lib/libnm.so
    0x2af00000-0x2af06000 /exos/lib/libcli.so
    0x2af80000-0x2afa4000 /exos/lib/libexpat.so
    0x2b000000-0x2b018000 /exos/lib/libcmbackend.so
    0x2b080000-0x2b08c000 /exos/lib/libems.so
    0x2b100000-0x2b117000 /exos/lib/libdispatch.so
    0x2b180000-0x2b18c000 /exos/lib/libwkninfo.so
    0x2b200000-0x2b35d000 /lib/libc-2.2.5.so
    0x2b3c0000-0x2b3c3000 /lib/libdl-2.2.5.so
    0x2b440000-0x2b441000 /exos/lib/libusertrace.so
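
    One way to read a dump like this is to look up which mapped file each saved address falls into. The Python sketch below is a hypothetical helper, not an Extreme tool. The segment lines and the epc/ra values are copied from the dump above, and it assumes each "Text segment map" line has the form start-end path with an exclusive end address.

```python
import re

# Subset of the "Text segment map" lines from the dump above.
SEGMENT_MAP = """\
0x00400000-0x00414000 /exos/bin/nodemgr
0x2ab40000-0x2ab52000 /lib/libpthread-0.9.so
0x2ac40000-0x2ac71000 /exos/lib/libipml.so
0x2b200000-0x2b35d000 /lib/libc-2.2.5.so
"""

SEGMENT_RE = re.compile(r"0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\s+(\S+)")


def parse_segments(text):
    """Return a list of (start, end, path) tuples from text segment map lines."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in SEGMENT_RE.finditer(text)
    ]


def resolve(addr_hex, segments):
    """Return the path of the segment containing the address, or None."""
    addr = int(addr_hex, 16)
    for start, end, path in segments:
        if start <= addr < end:
            return path
    return None


if __name__ == "__main__":
    segments = parse_segments(SEGMENT_MAP)
    # epc and ra values as printed in the two crash records (pid 619, pid 403).
    for name, value in (("epc", "2b22af94"), ("ra", "2ab4b1f4"),
                        ("epc", "2b2f4144"), ("ra", "2ac6b7d0")):
        print(name, value, "->", resolve(value, segments))
```

    With these inputs, both epc values fall inside /lib/libc-2.2.5.so, while the ra values fall inside libpthread and libipml respectively. Signal 6 is conventionally SIGABRT on Linux, so an epc inside libc would be consistent with the process aborting itself, although the dump alone does not show why.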