Monitoring Chassis and Module Status of EOS (Enterasys) Devices with SNMP


Userlevel 3
I'd like to use PRTG for status polling of Enterasys switches, especially S-Series and SecureStacks.
Are there any SNMP OIDs which represent the status of modules in a chassis or the global status of a chassis or switch?

I'm especially interested in module or stack member failures.

Kind regards
Christoph

9 replies

Userlevel 6
I would review the following.
entityMIB=1.3.6.1.2.1.47
Userlevel 4
These should be helpful...

MIBs for N/S-Series Power Supply and Fan Tray Status
Displaying the CPU Utilization of the N/S-Series

MIB for determining SecureStack switch stack attributes
Viewing System Utilization on the SecureStacks
Userlevel 3
Hello Mike,
hello Paul,

thanks for your answers. Unfortunately, it's not exactly what I'm looking for.
I have searched through the MIBs, but I was not able to find an OID which shows whether all line cards, fabrics or stacking members are online and working.

For example the entityMIB shows all stack members even though I disconnected one.

Maybe I could detect a defect by polling the serial numbers and triggering an alarm if a value is empty. But I'm not sure whether all kinds of hardware failures could be discovered with such a workaround...?
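
A minimal sketch of the serial-number workaround described above, in Python. It assumes the serial numbers have already been polled from entPhysicalSerialNum (ENTITY-MIB, under the entityMIB subtree mentioned earlier) by whatever tool is in use; the polling itself is left out. The function name, the `expected` list, and the sample values are hypothetical. Whether an Enterasys device actually returns an empty serial for a failed member is exactly the open question here, so treat this as an illustration, not a verified check.

```python
# Column OID of entPhysicalSerialNum in ENTITY-MIB; rows are indexed by entPhysicalIndex.
ENT_PHYSICAL_SERIAL_NUM = "1.3.6.1.2.1.47.1.1.1.1.11"


def missing_serials(polled, expected):
    """Return the expected entPhysicalIndex values that have no serial number.

    polled   -- dict mapping entPhysicalIndex -> serial string (or None),
                as collected by an SNMP poller (assumption: done elsewhere)
    expected -- indexes of the modules / stack members that should be present
    """
    missing = []
    for idx in expected:
        serial = polled.get(idx)  # None if the row is absent altogether
        if serial is None or not serial.strip():
            missing.append(idx)
    return missing


# Hypothetical sample: member 3 answers with an empty serial, member 4 is absent.
sample = {1: "SN-0001", 2: "SN-0002", 3: ""}
print(missing_serials(sample, expected=[1, 2, 3, 4]))  # [3, 4]
```

Restricting the check to an explicit `expected` list avoids false alarms from entities such as ports or containers, which often have no serial number at all.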

Kind regards
Christoph
Userlevel 4
We use SNMP traps to detect the loss of stack members and modules.
Userlevel 3
Hey Curtis, thanks for your suggestion.
I also use traps, but with PRTG that's not the best solution. And traps are not sent if line cards or stack members don't come up after a reboot or power loss.
Userlevel 4
I personally do not think Netsight even handles stacks and modules well when it comes to the loss of a card or member. Unless an alarm is generated by the device, Netsight is blissfully unaware of a problem. Of course, if the stack member or card never powers up at boot, then no alarm is generated.
Userlevel 2
Hi all,
You have stumbled on a deficiency that I have grumbled about before. Not only do stackables not tell you when a stack member does not come up from a power down/reset, chassis have the same issue. Worse, they forget to tell you about failed power supplies and failed fans. I got Extreme to put in traps for failing blades/stack members in newer code, so maybe with a little push and shove, we as a community can get them to do something a little smarter at reboot/restart time.

I would propose that a management module (on any type of device, red or purple) scan its hardware looking for objects that are configured but not operational. These objects should include fans, power supplies, blades, stack units, stacking cables, and anything else that only produces a single trap/log entry when that object fails. (Note that this is not a definitive list, feel free to add.)
Upon finding such (apparently) broken objects, send an alert (trap/log/syslog) message out indicating same. The gotcha here is you might need to do this a couple of times spaced say a minute apart in order to have a reasonable chance of getting through to your favorite network management system. (The uplinks need to be online to get anywhere!)
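
The proposal above can be sketched roughly as follows, in Python. This is only an illustration of the idea, not anything the firmware does today: the object records, field names, `find_broken`, and `alert_repeatedly` are all hypothetical, and the `send` callback stands in for whatever trap/log/syslog mechanism would be used.

```python
import time


def find_broken(objects):
    """Return the names of objects that are configured but not operational."""
    return [o["name"] for o in objects
            if o["configured"] and not o["operational"]]


def alert_repeatedly(broken, send, repeats=3, interval_s=60.0, sleep=time.sleep):
    """Send one alert per broken object, `repeats` times, `interval_s` apart.

    Repeating gives the uplinks time to come online so that at least one
    round has a reasonable chance of reaching the management system.
    Returns the total number of alerts handed to `send`.
    """
    sent = 0
    for attempt in range(repeats):
        for name in broken:
            send(f"{name} is configured but not operational")
            sent += 1
        # No need to wait after the last round or when nothing is broken.
        if broken and attempt < repeats - 1:
            sleep(interval_s)
    return sent


# Hypothetical inventory as a management module might see it after boot.
inventory = [
    {"name": "fan-1", "configured": True, "operational": True},
    {"name": "psu-2", "configured": True, "operational": False},
    {"name": "slot-5", "configured": False, "operational": False},
]
broken = find_broken(inventory)
# A no-op sleep keeps the demonstration instant; a real scan would wait.
alert_repeatedly(broken, send=print, repeats=2, sleep=lambda s: None)
```

Unconfigured slots are ignored on purpose, so an intentionally empty slot does not raise an alert.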

Reasonable?
James
Userlevel 4
Netsight can tell you if there is a change in configuration by comparing current vs backed up config. Would be nice if it would tell you if there was a change in hardware configuration. That would catch the devices and modules that fail at power up.
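
As a rough illustration of that idea (not a Netsight feature), a hardware comparison could work like the configuration comparison: store a known-good inventory and diff each new poll against it. The Python sketch below assumes both inventories are available as simple mappings from slot or unit name to serial number; the function name and sample data are hypothetical.

```python
def hardware_changes(baseline, current):
    """Compare two inventories that map slot/unit name -> serial number.

    Returns a dict with three sorted lists:
      missing  -- present in the baseline but gone now (e.g. failed at power up)
      added    -- present now but not in the baseline
      replaced -- present in both, but with a different serial number
    """
    base_keys, cur_keys = set(baseline), set(current)
    return {
        "missing": sorted(base_keys - cur_keys),
        "added": sorted(cur_keys - base_keys),
        "replaced": sorted(k for k in base_keys & cur_keys
                           if baseline[k] != current[k]),
    }


# Hypothetical example: stack unit 3 did not come back after a power loss.
baseline = {"unit-1": "SN-A", "unit-2": "SN-B", "unit-3": "SN-C"}
current = {"unit-1": "SN-A", "unit-2": "SN-B"}
print(hardware_changes(baseline, current))
# {'missing': ['unit-3'], 'added': [], 'replaced': []}
```

Because the comparison runs against a stored baseline, it does not rely on the device sending a trap, which is the gap discussed in this thread.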
Userlevel 6
Let's make sure that we have the following basics covered. This applies directly only to Netsight, but you may be able to use the same concepts in other network management apps that use traps.

Make sure that traps are configured first in Netsight.
https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-Configure-SNMPv3-Traps-via-NetSight...

Then make sure that traps for switch loss are enabled.
https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-create-an-alarm-in-NetSight-to-show...
