ExtremeSwitching (EXOS)

  • 1.  Save configuration error

    Posted 09-23-2015 10:40
    During a maintenance window, after making some changes to a stack configuration (the stack is composed of two x460 and one x440), we wanted to save the configuration to primary, so we ran the save command and were prompted as follows:

    The configuration file primary.cfg already exists.
    Do you want to save configuration to primary.cfg and overwrite it? (y/N) Yes

    After that we received this message:

    Error: This command cannot be executed during configuration save.


    After looking around, we could not find a way to check the state of the save process. The only thing we found is that the last save attempt was made by the Ridgeline server three days ago:

    [i] 09/20/2015 19:06:23.72 Slot-1: x.x.x.x (telnet) userXXX: SAVE CONFIGURATION

    But the last configuration save was apparently made on June 1st (see the show switch output below):

    primary.cfg Created by ExtremeXOS version 15.5.3.4
    1053995 bytes saved on Mon Jun 1 01:37:34 2015
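
    For anyone who wants to spot this kind of stale save automatically, here is a minimal Python sketch. It only parses the two text formats quoted above (the "bytes saved on" line from show switch and the log timestamp) and compares them. The function names are hypothetical, and it assumes both outputs have already been captured as plain text.

```python
import re
from datetime import datetime

# Matches the "bytes saved on Mon Jun 1 01:37:34 2015" line from "show switch".
SAVED_RE = re.compile(
    r"bytes saved on (\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)
# Matches a log timestamp such as "09/20/2015 19:06:23.72".
LOG_RE = re.compile(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d+)")


def last_saved(show_switch_text):
    """Return the time the config file was last written, or None."""
    m = SAVED_RE.search(show_switch_text)
    if not m:
        return None
    stamp = " ".join(m.group(1).split())  # collapse repeated spaces
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def log_time(log_line):
    """Return the timestamp of a single log line, or None."""
    m = LOG_RE.search(log_line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S.%f")


if __name__ == "__main__":
    saved = last_saved("1053995 bytes saved on Mon Jun 1 01:37:34 2015")
    attempt = log_time(
        "[i] 09/20/2015 19:06:23.72 Slot-1: x.x.x.x (telnet) userXXX: SAVE CONFIGURATION"
    )
    # A large gap means a save was requested long after the file was last written.
    print("days between file write and last save attempt:", (attempt - saved).days)
```

    A large gap between the two timestamps suggests that a requested save never completed.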

    How can I check and release the save process (other than by rebooting)?

    Thanks a lot.

    Tristan


  • 2.  RE: Save configuration error

    Posted 09-23-2015 13:56
    Tristan, are you able to save to a configuration file other than primary.cfg? 'save configuration


  • 3.  RE: Save configuration error

    Posted 09-24-2015 07:47
    Hello,

    thanks for your response.

    I tried saving the configuration to another file, as you suggested, and the same error appears.

    save conf test
    Do you want to save configuration to test.cfg? (y/N) Yes
    Error: This command cannot be executed during configuration save.

    The show session command shows that my connection is the only active one:

    sh session

    # Login Time User
    ================================================================================
    *68 Thu Sep 24 10:00:53 2015 userXXX

    Next, I removed the switch from Ridgeline management, and the result is the same.

    Here is the output of the top command. I don't see anything unusual; does anybody have an idea?

    Load average: 7.53 7.55 7.47 3/204 11799

      PID  PPID USER STAT   RSS %MEM CPU %CPU COMMAND
     1451     1 root S <  26512  2.5   0  4.2 ./hal
     1243     2 root SW<      0  0.0   0  2.4 [bcmLINK.0]
     1800     2 root SW<      0  0.0   0  1.8 [bcmCNTR.0]
     1801     2 root SW<      0  0.0   1  1.6 [bcmCNTR.1]
     1475     1 root S     3612  0.3   1  0.6 ./fdb
    11799 11798 root R      852  0.0   0  0.6 top -d 3
     1787     1 root S      832  0.0   1  0.6 ./exsshd
     1246     2 root RW<      0  0.0   1  0.6 [bcmLINK.1]
     1547     1 root S     3796  0.3   0  0.3 ./acl
     1530     1 root S     3368  0.3   1  0.3 ./pim
     1088     1 root S     2716  0.2   1  0.1 /exos/bin/epm -t 40 -f /exos/config/epmrc.Edge -d /exos/config/epmdprc
     1520     1 root S     2484  0.2   0  0.1 ./rip
     1248     2 root SW<      0  0.0   0  0.1 [bcmASYNC]
     1295     2 root DW<      0  0.0   1  0.1 [tbcm_msm_tx0]
     1455     1 root S    18692  1.8   1  0.0 ./cliMaster
     1564     1 root S    10004  0.9   1  0.0 ./etmon
     1803     1 root S     5996  0.5   1  0.0 ./snmpMaster
     1461     1 root S     5508  0.5   0  0.0 ./snmpSubagent
     1457     1 root S     4768  0.4   1  0.0 ./cfgmgr
     1577     1 root S     4764  0.4   1  0.0 ./xmld
     1447     1 root S     4732  0.4   1  0.0 ./emsServer
     1604     1 root S     4612  0.4   0  0.0 ./idMgr
     1465     1 root S     4456  0.4   1  0.0 ./vlan



  • 4.  RE: Save configuration error

    Posted 09-24-2015 09:48
    Hi Tristan,

    Could you check the show switch output and see if the master and the backup nodes are in sync with each other?


  • 5.  RE: Save configuration error

    Posted 09-24-2015 12:51
    Hi Prashanth,

    You're right: in the "show switch" output I see that slot 2 is not in sync (the "(In Sync)" marker is missing for slot 2, as you can see in the output below).

    Slot:             Slot-1 *                   Slot-2
                      ------------------------   ------------------------
    Current State:    MASTER                     BACKUP

    Image Selected:   secondary                  secondary
    Image Booted:     secondary                  secondary
    Primary ver:      15.2.3.2                   15.2.3.2
    Secondary ver:    15.5.3.4                   15.5.3.4
                      patch1-2                   patch1-2
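
    The check described above can be scripted. This is only a rough Python sketch: it assumes the "(In Sync)" marker, when present, appears on the "Current State:" line of the captured show switch text, and the function name is hypothetical.

```python
def parse_current_state(show_switch_text):
    """Return (roles, in_sync) taken from the 'Current State:' line.

    Assumption: a healthy backup node is flagged with the literal text
    "(In Sync)" on that line; if the marker is absent, in_sync is False.
    """
    label = "Current State:"
    for line in show_switch_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(label):
            rest = stripped[len(label):]
            in_sync = "(In Sync)" in rest
            roles = rest.replace("(In Sync)", "").split()
            return roles, in_sync
    # No 'Current State:' line found at all.
    return [], False


if __name__ == "__main__":
    sample = "Slot: Slot-1 * Slot-2\nCurrent State: MASTER BACKUP\n"
    roles, in_sync = parse_current_state(sample)
    print(roles, "in sync" if in_sync else "NOT in sync")
```

    Running a check like this before a maintenance window would flag an out-of-sync backup before any save is attempted.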

    How can I resynchronize the two slots?



  • 6.  RE: Save configuration error

    Posted 09-24-2015 13:00
    Hello,

    I telnetted to slot 2 and, as confirmation, this switch apparently has not synced the stack configuration; the CLI prompt is still the default:
    * Slot-2 Stack.1 >

    Fortunately, slot 1 is the stack master, so the stack is still working.

    When I try to run commands such as "show log" or "show conf" on slot 2, I get the same error each time:

    ERROR: ems has not finished loading its configuration, please retry command later.



  • 7.  RE: Save configuration error

    Posted 09-24-2015 13:03
    Hi Tristan,

    Thank you for sharing the details. The only way to synchronize the slots requires a reboot of the target slot.

    synchronize slot


  • 8.  RE: Save configuration error

    Posted 09-24-2015 13:28
    To investigate why it got into this state, please share the following details:

    What is the uptime of the stack?
    Has it been in sync since slot 2 last booted?
    Are you aware of any recent changes made to this stack, such as adding or modifying details in Ridgeline?

    Let's see if we can get a clue about the trigger.

    And just to set expectations: finding the root cause might be difficult, as the stack is already in the failed state and recovery requires a reboot.



  • 9.  RE: Save configuration error

    Posted 09-25-2015 06:04
    Prashanth, here are the answers to your questions:

    System UpTime: 214 days 21 hours 50 minutes 19 seconds (for the stack)

    Yes, the stack was in sync during the last maintenance window. However, I have seen that slot 2 has rebooted many times over the last few months for unexpected reasons. Here is the log of the last reboot:

    [i]08/24/2015 01:06:05.88 Slot-1: Module in Slot-2 is operational
    [i]08/24/2015 01:06:01.33 Slot-1: Done synching ACLs to Slot-2
    [i]08/24/2015 01:06:00.87 Slot-1: Synching ACLs to Slot-2
    08/24/2015 01:05:40.07
    [i]08/24/2015 01:05:37.05 Slot-1: Module in Slot-2 is inserted
    08/24/2015 01:02:55.57
    08/24/2015 01:02:55.56
    [i]08/24/2015 01:02:54.07 Slot-1: Slot-2 down, resetting all TCP connections to it
    [i]08/24/2015 01:02:54.07 Slot-1: Module in Slot-2 is removed
    08/24/2015 01:02:53.48
    [i]08/24/2015 01:02:53.29 Slot-1: Slot-2 down, resetting all TCP connections to it
    [i]08/24/2015 01:02:53.29 Slot-1: Module in Slot-2 is removed
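
    To see how often and for how long slot 2 dropped out, a log like the one above can be summarized with a short script. This is a minimal Python sketch under a few assumptions: the log is available as plain text in the format shown, an outage runs from the first "is removed" message to the next "is operational" message, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches e.g. "08/24/2015 01:06:05.88 Slot-1: Module in Slot-2 is operational"
EVENT_RE = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d+) Slot-\d+: "
    r"Module in (Slot-\d+) is (removed|inserted|operational)"
)


def slot_outages(log_text, slot="Slot-2"):
    """Return a list of (down_at, up_at, seconds) for the given slot."""
    events = []
    for line in log_text.splitlines():
        m = EVENT_RE.search(line)
        if m and m.group(2) == slot:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S.%f")
            events.append((ts, m.group(3)))
    events.sort()  # the switch prints newest first; process oldest first

    outages = []
    down_since = None
    for ts, state in events:
        if state == "removed" and down_since is None:
            down_since = ts  # first removal message starts the outage
        elif state == "operational" and down_since is not None:
            outages.append((down_since, ts, (ts - down_since).total_seconds()))
            down_since = None
    return outages


SAMPLE_LOG = """\
[i]08/24/2015 01:06:05.88 Slot-1: Module in Slot-2 is operational
[i]08/24/2015 01:05:37.05 Slot-1: Module in Slot-2 is inserted
[i]08/24/2015 01:02:54.07 Slot-1: Module in Slot-2 is removed
[i]08/24/2015 01:02:53.29 Slot-1: Module in Slot-2 is removed
"""

if __name__ == "__main__":
    for down_at, up_at, seconds in slot_outages(SAMPLE_LOG):
        print(f"{down_at} -> {up_at}: {seconds:.2f} s")
```

    Feeding the full log history through a script like this would show whether the unexpected reboots follow any pattern.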

    I'm not aware of any changes in the last few months. I checked show debug system-dump and nothing is present. We'll try to resynchronize the slot next Sunday, and I will report back when it's done.


  • 10.  RE: Save configuration error

    Posted 09-28-2015 05:03
    Last Sunday we ran the synchronization for slot 2, but unfortunately the slot was still not synchronized after it reloaded. We then rebooted the entire stack, and the synchronization is now OK.

    Thanks for your support.