Userlevel 1
During a maintenance window, after making some changes to a stack configuration (the stack consists of two x460 and one x440), we wanted to save the configuration to primary, so we ran the following command:

The configuration file primary.cfg already exists.
Do you want to save configuration to primary.cfg and overwrite it? (y/N) Yes

After that we received this message:

Error: This command cannot be executed during configuration save.

After looking around, we could not find a way to check the status of the save process. The only thing we found is that the last save attempt was made by the Ridgeline server three days ago:

[i] 09/20/2015 19:06:23.72 Slot-1: x.x.x.x (telnet) userXXX: SAVE CONFIGURATION

But the last configuration save that actually completed was apparently made on June 1st (see the output of the show switch command below):

primary.cfg Created by ExtremeXOS version
1053995 bytes saved on Mon Jun 1 01:37:34 2015

How can I check and release the save process (other than by rebooting)?

Thanks a lot.


Userlevel 5
Tristan, are you able to save to a configuration file other than primary.cfg ('save configuration ')?

Also, are there any other Telnet/SSH sessions open, or any polling going on, that could be preventing access to resources?

Review the 'top' output for CLI, SSH, SNMP, and other high process usage.
Userlevel 1

Thanks for your response.

I tried to save the configuration to another file, as you suggested, and the same error appears:

save conf test
Do you want to save configuration to test.cfg? (y/N) Yes
Error: This command cannot be executed during configuration save.

The show session command shows that my connection is the only active one:

sh session

# Login Time User
*68 Thu Sep 24 10:00:53 2015 userXXX

Next, I removed the switch from Ridgeline management, and the result is the same.

Here is the output of the top command. I don't see anything unusual. Does anybody have an idea?

Load average: 7.53 7.55 7.47 3/204 11799

1451 1 root S < 26512 2.5 0 4.2 ./hal
1243 2 root SW< 0 0.0 0 2.4 [bcmLINK.0]
1800 2 root SW< 0 0.0 0 1.8 [bcmCNTR.0]
1801 2 root SW< 0 0.0 1 1.6 [bcmCNTR.1]
1475 1 root S 3612 0.3 1 0.6 ./fdb
11799 11798 root R 852 0.0 0 0.6 top -d 3
1787 1 root S 832 0.0 1 0.6 ./exsshd
1246 2 root RW< 0 0.0 1 0.6 [bcmLINK.1]
1547 1 root S 3796 0.3 0 0.3 ./acl
1530 1 root S 3368 0.3 1 0.3 ./pim
1088 1 root S 2716 0.2 1 0.1 /exos/bin/epm -t 40 -f /exos/config/epmrc.Edge -d /exos/config/epmdprc
1520 1 root S 2484 0.2 0 0.1 ./rip
1248 2 root SW< 0 0.0 0 0.1 [bcmASYNC]
1295 2 root DW< 0 0.0 1 0.1 [tbcm_msm_tx0]
1455 1 root S 18692 1.8 1 0.0 ./cliMaster
1564 1 root S 10004 0.9 1 0.0 ./etmon
1803 1 root S 5996 0.5 1 0.0 ./snmpMaster
1461 1 root S 5508 0.5 0 0.0 ./snmpSubagent
1457 1 root S 4768 0.4 1 0.0 ./cfgmgr
1577 1 root S 4764 0.4 1 0.0 ./xmld
1447 1 root S 4732 0.4 1 0.0 ./emsServer
1604 1 root S 4612 0.4 0 0.0 ./idMgr
1465 1 root S 4456 0.4 1 0.0 ./vlan
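
For anyone who wants to track the load figures over time rather than eyeball them, here is a minimal Python sketch that parses the "Load average:" line shown above. It assumes the line follows the usual Linux /proc/loadavg field order (1, 5 and 15 minute averages, running/total tasks, last PID); the names `LoadAverage` and `parse_load_average` are made up for this example and are not part of any ExtremeXOS tooling.

```python
from typing import NamedTuple


class LoadAverage(NamedTuple):
    one_min: float
    five_min: float
    fifteen_min: float
    running: int   # currently runnable tasks
    total: int     # total tasks
    last_pid: int


def parse_load_average(line: str) -> LoadAverage:
    """Parse a 'Load average: ...' line (assumed /proc/loadavg field order)."""
    prefix = "Load average:"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError(f"not a load average line: {line!r}")
    fields = line[len(prefix):].split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    running, total = fields[3].split("/")
    return LoadAverage(
        float(fields[0]),
        float(fields[1]),
        float(fields[2]),
        int(running),
        int(total),
        int(fields[4]),
    )


# Line copied from the top output in this post.
load = parse_load_average("Load average: 7.53 7.55 7.47 3/204 11799")
print(load.one_min, load.five_min, load.fifteen_min)
print(f"{load.running} running out of {load.total} tasks")
```

Whether a load of about 7.5 is abnormal on this platform is not something this thread establishes, so the sketch only extracts the numbers and leaves the threshold to the reader.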
Userlevel 6
Hi Tristan,

Could you check the show switch output and see if the master and the backup nodes are in sync with each other?
Userlevel 1
Hi Prashanth,

You're right. In the "show switch" output, slot 2 is not in sync: the "(In Sync)" marker is missing for that slot, as you can see in the output below.

Slot: Slot-1 * Slot-2
------------------------ ------------------------
Current State: MASTER BACKUP

Image Selected: secondary secondary
Image Booted: secondary secondary
Primary ver:
Secondary ver:
patch1-2 patch1-2

How can I resynchronize the two slots?
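
The check described above (is the "(In Sync)" marker present?) can be scripted against captured "show switch" text. This is only a sketch: it assumes the marker appears on the "Current State:" line when the backup is healthy, which matches the description in this thread but should be confirmed on your own software version. The function name `backup_in_sync` and the healthy sample line are hypothetical.

```python
def backup_in_sync(show_switch_output: str) -> bool:
    """Return True if the 'Current State:' line carries the '(In Sync)' marker."""
    for line in show_switch_output.splitlines():
        if line.strip().startswith("Current State:"):
            return "(In Sync)" in line
    raise ValueError("no 'Current State:' line found in the output")


# Line copied from the output above: the marker is missing.
failing = "Current State: MASTER BACKUP"

# Hypothetical healthy line, assumed from the description in this thread.
healthy = "Current State: MASTER BACKUP (In Sync)"

print(backup_in_sync(failing))  # False
print(backup_in_sync(healthy))  # True
```

Running a check like this before a configuration save would flag the out-of-sync backup up front, instead of after the save has already stalled.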
Userlevel 1

I tried to telnet to slot 2, and it confirms that this switch has not synced the stack configuration: the CLI prompt is still the default one:
* Slot-2 Stack.1 >

Fortunately, slot 1 is the stack master, so the stack is still working.

When I try to run commands such as "show log" or "show conf", I get the same error each time:

ERROR: ems has not finished loading its configuration, please retry command later.
Userlevel 6
Hi Tristan,

Thank you for sharing the details. The only way to synchronize the slots requires a reboot of the target slot.

synchronize slot

This reboots the specified slot, which is then synchronized during bootup.

Hope this helps!
Userlevel 6
To investigate why the stack got into this state, please share the following details:

What is the uptime of the stack?
Has it been in sync since slot 2 last booted?
Are you aware of any recent changes made to this stack, such as adding or modifying details in Ridgeline?

Let's see if we can get a clue about the trigger.

And just to set expectations:
Finding the root cause might be difficult, as the stack is already in the failed state and the recovery option requires a reboot.
Userlevel 1
Prashanth, here are the answers to your questions:

System UpTime: 214 days 21 hours 50 minutes 19 seconds (for the stack)

Yes, the stack was in sync during the last maintenance window. However, I've seen that slot 2 has rebooted unexpectedly many times over the last few months. Here is the log of the last reboot:

[i]08/24/2015 01:06:05.88 Slot-1: Module in Slot-2 is operational
[i]08/24/2015 01:06:01.33 Slot-1: Done synching ACLs to Slot-2
[i]08/24/2015 01:06:00.87 Slot-1: Synching ACLs to Slot-2
08/24/2015 01:05:40.07 Slot-1: Slot-2 being Powered ON
[i]08/24/2015 01:05:37.05 Slot-1: Module in Slot-2 is inserted
08/24/2015 01:02:55.57 Slot-3: Slot-2 FAILED (1) Not In Sync
08/24/2015 01:02:55.56 Slot-3: BACKUP NODE (Slot-2) DOWN
[i]08/24/2015 01:02:54.07 Slot-1: Slot-2 down, resetting all TCP connections to it
[i]08/24/2015 01:02:54.07 Slot-1: Module in Slot-2 is removed
08/24/2015 01:02:53.48 Slot-1: BACKUP NODE (Slot-2) DOWN
[i]08/24/2015 01:02:53.29 Slot-1: Slot-2 down, resetting all TCP connections to it
[i]08/24/2015 01:02:53.29 Slot-1: Module in Slot-2 is removed

I'm not aware of any changes in the last few months. I've checked show debug system-dump and nothing is present. We'll try to resynchronize the slot next Sunday, and I'll report back when it's done.
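
Since slot 2 has rebooted several times, it may help to pull the reboot windows out of the log mechanically. The sketch below parses log lines in the format shown above (optional "[i]" tag, MM/DD/YYYY timestamp, reporting slot, message) and measures the time between "is removed" and "is operational" for a slot. The regular expression is an assumption based only on the lines pasted in this thread, the helper names are hypothetical, and the function assumes the text contains a single outage.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the log lines in this thread:
# [optional tag] MM/DD/YYYY HH:MM:SS.hh Slot-N: message
LOG_LINE = re.compile(
    r"^(?:\[\w\])?\s*(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2}) (Slot-\d+): (.*)$"
)


def parse_log_line(line):
    """Return (timestamp, reporting_slot, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S.%f")
    return stamp, match.group(2), match.group(3)


def outage_seconds(log_text, slot="Slot-2"):
    """Seconds between the first 'removed' and the last 'operational' message for a slot."""
    removed, operational = [], []
    for line in log_text.splitlines():
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        stamp, _reporter, message = parsed
        if message == f"Module in {slot} is removed":
            removed.append(stamp)
        elif message == f"Module in {slot} is operational":
            operational.append(stamp)
    if not removed or not operational:
        return None
    return (max(operational) - min(removed)).total_seconds()


# Three lines copied from the log above (newest first, as the switch prints them).
SAMPLE = """\
[i]08/24/2015 01:06:05.88 Slot-1: Module in Slot-2 is operational
08/24/2015 01:02:55.57 Slot-3: Slot-2 FAILED (1) Not In Sync
[i]08/24/2015 01:02:53.29 Slot-1: Module in Slot-2 is removed
"""

print(outage_seconds(SAMPLE))
```

For the reboot logged above, this gives a window of a little over three minutes between slot 2 being removed and becoming operational again. Running it over a longer log would require splitting the text per outage first, which this sketch does not attempt.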
Userlevel 1
Last Sunday we ran the synchronization for slot 2, but unfortunately the slot was still not synchronized after it reloaded. We then rebooted the entire stack, and the synchronization is now OK.

Thanks for your support.