Discussion:
[c-nsp] monitoring switch stacks
Alan Buxey
2009-10-14 18:19:10 UTC
hi,

Just wondered what folk out there do to monitor switch stacks
(e.g. StackWise+ stacks like 3750E, 2975GS etc., not the older
GigaStack ones). Basic methods such as ICMP only show that the
stack is reachable, not the actual health of the stack, e.g. that
one member is missing. I'm looking at SNMP, but MIB support in
stacks seems somewhat poor, which leaves syslog to alert when
something happens to the stack. Best practice?

alan
_______________________________________________
cisco-nsp mailing list cisco-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Dale W. Carder
2009-10-14 18:34:42 UTC
Post by Alan Buxey
just wondered what folk did out there to monitor switch stacks
(eg stackwise+ switch stacks like 3750e, 2975gs etc (not the older
gigastack ones....) ) - using the basic methods such as ICMP will
only show the presence of connectivity to the stack but not the
actual health of the stack - eg one member is missing. I'm looking
at maybe SNMP but support for MIBS in stacks seems somewhat poor

They show up fine, at least on recent code. On earlier
versions of code (2 years ago or so), SNMP support was very
buggy and unreliable.

We monitor the following. There have been occasions when
the switch stack ports failed and this caught it.

Cheers,
Dale

IF-MIB::ifDescr.5365 = STRING: StackPort1
IF-MIB::ifDescr.5366 = STRING: StackSub-St1-1
IF-MIB::ifDescr.5367 = STRING: StackSub-St1-2
IF-MIB::ifDescr.5368 = STRING: StackPort2
IF-MIB::ifDescr.5369 = STRING: StackSub-St2-1
IF-MIB::ifDescr.5370 = STRING: StackSub-St2-2
IF-MIB::ifDescr.5371 = STRING: StackPort3
IF-MIB::ifDescr.5372 = STRING: StackSub-St3-1
IF-MIB::ifDescr.5373 = STRING: StackSub-St3-2

IF-MIB::ifOperStatus.5365 = INTEGER: up(1)
IF-MIB::ifOperStatus.5366 = INTEGER: up(1)
IF-MIB::ifOperStatus.5367 = INTEGER: up(1)
IF-MIB::ifOperStatus.5368 = INTEGER: up(1)
IF-MIB::ifOperStatus.5369 = INTEGER: up(1)
IF-MIB::ifOperStatus.5370 = INTEGER: up(1)
IF-MIB::ifOperStatus.5371 = INTEGER: up(1)
IF-MIB::ifOperStatus.5372 = INTEGER: up(1)
IF-MIB::ifOperStatus.5373 = INTEGER: up(1)

CISCO-STACKWISE-MIB::cswSwitchState.1001 = INTEGER: ready(4)
CISCO-STACKWISE-MIB::cswSwitchState.2001 = INTEGER: ready(4)
CISCO-STACKWISE-MIB::cswSwitchState.3001 = INTEGER: ready(4)
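
A check built on the objects above could be sketched as follows. This is only an illustration, assuming the walk output is available as net-snmp style text in the same format as shown here; the function name `find_unhealthy` and the `EXPECTED` table are hypothetical, and the "healthy" values are simply the ones visible in the output above.

```python
import re

# Matches net-snmp style lines such as
# "IF-MIB::ifOperStatus.5365 = INTEGER: up(1)".
LINE_RE = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<obj>\w+)\.(?P<index>\d+) = \w+: (?P<value>.+)$"
)

# Healthy value per object, taken from the walk output in the post.
# (Assumption: anything else is worth flagging.)
EXPECTED = {
    "ifOperStatus": "up(1)",
    "cswSwitchState": "ready(4)",
}


def find_unhealthy(walk_text):
    """Return (object, index, value) tuples for rows not in the expected state."""
    problems = []
    for line in walk_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        obj = match.group("obj")
        expected = EXPECTED.get(obj)
        # Objects without an expected value (e.g. ifDescr) are ignored.
        if expected is not None and match.group("value") != expected:
            problems.append((obj, int(match.group("index")), match.group("value")))
    return problems
```

An empty list means every monitored row matched its expected state. Note that this only flags rows that are present; a row that disappears from the walk entirely would need a separate comparison against the expected set of indexes.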

Ge Moua
2009-10-14 18:59:45 UTC
Dale Carder-
Are you guys also monitoring queue drops on the interfaces? If so,
can you forward me the OID?

Regards,
Ge Moua | Email: ***@umn.edu

Network Design Engineer
University of Minnesota | Networking & Telecommunications Services
Dale W. Carder
2009-10-14 19:18:26 UTC
Hey Ge!

We monitor for input queue drops on 6500s with this OID:

.1.3.6.1.4.1.9.9.276.1.1.1.1.10

Our alert for the NOC: drops > 100/sec result in a major
alarm. Usually it's something stupid happening on a given
VLAN that needs to be beaten down. For SVIs, this goes hand
in hand with punts causing CPU exhaustion on these wimpy
RPs.

I've thought about watching output queue drops, but am not
sure how to differentiate normal from abnormal.

Dale
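
The > 100 drops/sec alert implies turning two polls of a cumulative counter into a rate. A minimal sketch of that arithmetic follows, assuming the counter behind the OID above is a 32-bit SNMP counter (the OID appears to be cieIfInputQueueDrops in CISCO-IF-EXTENSION-MIB, but verify that against your own MIB files). The function names and the polling interval are hypothetical; only the 100/sec threshold comes from the post.

```python
COUNTER32_MODULUS = 2 ** 32      # assumption: the drop counter is a Counter32
DROPS_PER_SEC_THRESHOLD = 100    # from the post: > 100 drops/sec is a major alarm


def drop_rate(prev_count, curr_count, interval_seconds):
    """Drops per second between two polls, tolerating a single counter wrap."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Modulo arithmetic keeps the delta non-negative if the counter wrapped.
    delta = (curr_count - prev_count) % COUNTER32_MODULUS
    return delta / interval_seconds


def is_major_alarm(prev_count, curr_count, interval_seconds):
    """True when the drop rate exceeds the NOC threshold."""
    return drop_rate(prev_count, curr_count, interval_seconds) > DROPS_PER_SEC_THRESHOLD
```

For example, counter values of 1000 and 7060 taken 60 seconds apart give 101 drops/sec, which would trip the alarm. A counter reset (for instance after a reload) looks like a huge wrap to this code, so a real poller would likely also compare sysUpTime between polls.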
Ge Moua
2009-10-14 19:46:53 UTC
Dale, are you guys monitoring queue drops on edge switches like the
Cisco 3750? If so, I'm guessing the OID will be slightly different?
Thanks for the reply!

Regards,
Ge Moua | Email: ***@umn.edu

Network Design Engineer
University of Minnesota | Networking & Telecommunications Services
Kevin Graham
2009-10-14 19:37:22 UTC
Post by Alan Buxey
just wondered what folk did out there to monitor switch stacks
(eg stackwise+ switch stacks like 3750e, 2975gs etc (not the older
gigastack ones....) ) - using the basic methods such as ICMP will
only show the presence of connectivity to the stack but not the
actual health of the stack - eg one member is missing.

Perhaps the easiest way to do this is walking:

CISCO-STACKWISE-MIB::cswStackPortOperStatus

...to handle non-stackables or non-stacked stackables, we skip
the device on a NOSUCHOBJECT or where the row count is <= 2.
Once past those checks, it's reasonable to alert on >= 1 ports
in a state other than up(1).

Stack ports from missing members will still be there (and
obviously in a non-up operStatus), so you catch that state as
well as a partially-cabled stack.

For example, this is a 2-switch stack with a missing member:

CISCO-STACKWISE-MIB::cswStackPortOperStatus.5180 = INTEGER: down(2)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5181 = INTEGER: down(2)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5183 = INTEGER: down(2)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5184 = INTEGER: down(2)

...and this is a 4-switch stack with a broken cable:

CISCO-STACKWISE-MIB::cswStackPortOperStatus.5366 = INTEGER: down(2)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5367 = INTEGER: up(1)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5369 = INTEGER: up(1)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5370 = INTEGER: down(2)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5372 = INTEGER: up(1)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5373 = INTEGER: up(1)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5375 = INTEGER: up(1)
CISCO-STACKWISE-MIB::cswStackPortOperStatus.5376 = INTEGER: up(1)
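
The rule described above (skip on NOSUCHOBJECT or <= 2 rows, otherwise alert on any port that is not up(1)) could be sketched like this. It is an illustration only: the function name and the input shape are hypothetical, and it assumes the poller has already turned the walk into a dict of ifIndex to status string, passing `None` when the agent answered NOSUCHOBJECT.

```python
def non_up_stack_ports(port_status):
    """Return the sorted indexes of stack ports that are not up(1).

    port_status: dict mapping index -> status string such as "up(1)",
    or None when the agent returned NOSUCHOBJECT (hypothetical convention).
    An empty list means "nothing to alert on".
    """
    if port_status is None:
        return []  # non-stackable platform: object not implemented
    if len(port_status) <= 2:
        return []  # non-stacked stackable: only its own two stack ports
    return sorted(idx for idx, status in port_status.items() if status != "up(1)")


# The 4-switch stack with a broken cable from the post:
broken_cable = {
    5366: "down(2)", 5367: "up(1)",
    5369: "up(1)", 5370: "down(2)",
    5372: "up(1)", 5373: "up(1)",
    5375: "up(1)", 5376: "up(1)",
}
ports_to_alert = non_up_stack_ports(broken_cable)  # [5366, 5370]
```

Applied to the 2-switch example with the missing member, all four indexes come back, so both the missing-member and the partially-cabled cases raise an alert.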

We had _really_ hoped to also catch 'flapping' stack ports by
watching ifLastChange (since healthy stack ports never change
state), but CSCsr85997 killed that idea.

I would be elated to see CISCO-SWITCH-HARDWARE-CAPACITY-MIB or
its equivalent for these switches (given their limited and
variably sized resources), but it seems unlikely that will
ever happen...