I have seen failures due to timeouts, which I theorized could have been caused by re-transmits.
We also had slot 2 of a 3-slot stack fail after code was pushed out. I have not had access to fully troubleshoot this failure. I thought it possible that if the code made it to the other slots and the stack then rebooted due to a power event, there would have been a code mismatch, which could explain the slot failure.
Unfortunately I cannot reboot the slots unless I am onsite, because the customer is concerned that some switches won't come back up.
You're saying the overs are not something I should be concerned with?
Is there a way to check that the stack ports are properly configured for jumbo frames?
Are there any commands I could run that might reduce the numbers I am seeing?
Here are about 16 hours' worth of errors:
We did reboot a building last night while we were onsite installing a 10G top-of-rack (TOR) backbone, and those stacks appear to be running clean now.
I hate to ask, but if a reboot does fix the issue, how often should they be rebooted? Two of the stacks in the image above have only been up 20 days.
You may be running into another, unrelated issue on the stack. I know that CPU utilization can climb with uptime on older versions of code. Would it be possible to look at top and see if it is running high? A reboot may be needed before the upgrade is attempted.
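For example, something like this from the EXOS CLI (command syntax from memory, so please verify against the command reference for your code version):

```
# Show per-process CPU usage directly from the EXOS shell
top

# Alternatively, turn on EXOS CPU monitoring and review the history
enable cpu-monitoring
show cpu-monitoring
```

If a single process is pinned near 100% after long uptime, that would support rebooting before the upgrade.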
Like Joe said, the RX over errors indicate that packets larger than the configured MTU size were received.
Normally, jumbo frames should be enabled on the stack ports internally with the max MTU size, and it should not be possible to disable it on the stack ports.
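One way to sanity-check this from the CLI (the port list here is just a placeholder for your own ports, and the exact output text may differ by EXOS version):

```
# Per-port detail includes a "Jumbo" line showing whether jumbo frames are on
show ports 1:1-48 information detail

# Watch whether the oversize/error counters are still incrementing
show ports 1:1-48 rxerrors
```

The internal stack ports are not normally user-configurable, so if jumbo frames show enabled on the front-panel ports and the overs are confined to the stack ports, that points back at the stack-port state rather than your config.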
Since they have been up for so long and are running an old version of EXOS, it might be worth a shot to just reboot the stack and then try to upgrade it after the reboot. It seems like the port config on the stack ports may be stuck in an odd state.
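Roughly, the sequence would look like this (image filename and server IP are placeholders, and the download syntax is from memory, so check it against the docs for your release before running it):

```
# Confirm current uptime, image version, and boot partition first
show switch

# Reboot the entire stack from the master
reboot

# After it comes back clean, stage the new image to the inactive partition
download image 10.0.0.5 summitX-22.7.1.2.xos secondary
```

Rebooting first means the slots come up on a known-matching image before you introduce the new code, which avoids the mid-upgrade mismatch scenario described above.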