Thanks for all the feedback.
I have seen failures due to timeouts, which I theorized could have been caused by retransmits.
We also had slot 2 of a 3-slot stack fail after code was pushed out. I have not had access to fully troubleshoot this failure. I thought it possible that the code made it only to the other slots; if the stack then rebooted due to a power event, there would have been a code mismatch, which could explain the slot failure.
Unfortunately, I cannot reboot the slots unless I am onsite, because the customer is concerned that some switches won't make it back.
You're saying the overs are not something I should be concerned with?
Is there a way to check that the stack ports are properly configured for jumbo frames?
Are there any commands I could run that might reduce the numbers I am seeing?
Here are about 16 hours' worth of errors:
We did reboot a building last night while we were onsite installing a 10G ToR backbone, and those stacks appear to be running clean now.
I hate to ask, but if the reboot does fix the issue, how often should they be rebooted? Two of the stacks in the image above have only been up 20 days.