Discussion:
[c-nsp] How to find the root cause of packet loss
Anton Turygin
2010-06-18 11:56:05 UTC
Hello,

Getting output drops and packet loss on Catalyst WS-C2960G-48TC-L.

The traffic is relatively small, but output drops are growing by hundreds
per second.

GigabitEthernet0/45 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 2893.fe8e.3b2d (bia 2893.fe8e.3b2d)
MTU 9000 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 122/255, rxload 14/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 1000Mb/s, link type is auto, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:20, output hang never
Last clearing of "show interface" counters never
Input queue: 0/4096/0/0 (size/max/drops/flushes); Total output drops: 32013814
Queueing strategy: fifo
Output queue: 0/4096 (size/max)
5 minute input rate 57250000 bits/sec, 33933 packets/sec
5 minute output rate 480872000 bits/sec, 47800 packets/sec
2298025991 packets input, 456494816566 bytes, 0 no buffer
Received 294650 broadcasts (245203 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 245203 multicast, 0 pause input
0 input packets with dribble condition detected
3306257265 packets output, 4196833602160 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
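
For scale, the counters above can be turned into ratios. This is a minimal Python sketch that uses only numbers copied from the output above; the variable names are illustrative. Because the counters were never cleared, the drop ratio is a lifetime average and says nothing about the current drop rate.

```python
# Values copied from the "show interface" output above.
bandwidth_bps = 1_000_000 * 1000   # BW 1000000 Kbit
output_rate_bps = 480_872_000      # 5 minute output rate
txload = 122                       # txload 122/255
output_drops = 32_013_814          # Total output drops
packets_output = 3_306_257_265     # packets output

utilization = output_rate_bps / bandwidth_bps
txload_fraction = txload / 255

# Assumption: dropped frames are NOT included in "packets output",
# so offered frames = transmitted + dropped.
drop_ratio = output_drops / (output_drops + packets_output)

print(f"5-minute output utilization: {utilization:.1%}")
print(f"txload as a fraction:        {txload_fraction:.1%}")
print(f"lifetime output drop ratio:  {drop_ratio:.2%}")
```

The port averages about 48% output utilization over five minutes, and just under 1% of offered frames have been dropped since the counters started.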



Buffers statistics:

Buffer elements:
1505 in free list (500 max allowed)
1652435 hits, 0 misses, 1024 created

Public buffer pools:
Small buffers, 104 bytes (total 34, permanent 25, peak 184 @ 18:14:40):
33 in free list (20 min, 60 max allowed)
253962 hits, 72 misses, 207 trims, 216 created
0 failures (0 no memory)
Middle buffers, 600 bytes (total 21, permanent 15, peak 81 @ 17:22:12):
19 in free list (10 min, 30 max allowed)
6054 hits, 75 misses, 219 trims, 225 created
0 failures (0 no memory)
Big buffers, 1536 bytes (total 53, permanent 50, peak 92 @ 17:22:12):
29 in free list (5 min, 150 max allowed)
315152 hits, 20 misses, 57 trims, 60 created
0 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 11, permanent 10, peak 13 @ 09:49:47):
1 in free list (0 min, 100 max allowed)
37614 hits, 2 misses, 3 trims, 4 created
0 failures (0 no memory)
Large buffers, 5024 bytes (total 0, permanent 0):
0 in free list (0 min, 5 max allowed)
0 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Huge buffers, 18024 bytes (total 1, permanent 0, peak 2 @ 18:14:30):
1 in free list (0 min, 2 max allowed)
1097 hits, 1 misses, 1 trims, 2 created
0 failures (0 no memory)

Interface buffer pools:
RxQFB buffers, 2040 bytes (total 150, permanent 150):
141 in free list (0 min, 150 max allowed)
13720 hits, 0 misses
RxQ1 buffers, 2040 bytes (total 128, permanent 128):
7 in free list (0 min, 128 max allowed)
231357 hits, 10964 fallbacks
RxQ3 buffers, 2040 bytes (total 16, permanent 16):
1 in free list (0 min, 16 max allowed)
42632 hits, 2705 fallbacks
RxQ4 buffers, 2040 bytes (total 64, permanent 64):
1 in free list (0 min, 64 max allowed)
3301 hits, 51 fallbacks
RxQ6 buffers, 2040 bytes (total 64, permanent 64):
0 in free list (0 min, 64 max allowed)
155 hits, 91 misses
RxQ7 buffers, 2040 bytes (total 96, permanent 96):
30 in free list (0 min, 96 max allowed)
581219 hits, 490 misses
RxQ8 buffers, 2040 bytes (total 32, permanent 32):
0 in free list (0 min, 32 max allowed)
203101 hits, 202315 misses
RxQ10 buffers, 2040 bytes (total 16, permanent 16):
0 in free list (0 min, 16 max allowed)
16 hits, 0 fallbacks
RxQ12 buffers, 2040 bytes (total 16, permanent 16):
0 in free list (0 min, 16 max allowed)
16 hits, 0 misses
RxQ15 buffers, 2040 bytes (total 4, permanent 4):
0 in free list (0 min, 4 max allowed)
1303978 hits, 1303974 misses
RxQ11 buffers, 2040 bytes (total 4, permanent 4):
0 in free list (0 min, 4 max allowed)
4 hits, 0 misses

Header pools:




Output interpreter says:
ERROR: Since it's last reload, this router has created or maintained a
relatively large number of 'Big buffers' yet still has very few free buffers.

ERROR: Since it's last reload, this router has created or maintained a
relatively large number of 'VeryBig buffers' yet still has very few free
buffers.

The above symptoms suggest that a buffer leak has occurred.



Upgrading IOS does not help. I even changed the switch but this did not
help either.

I also get:

#show controllers ethernet-controller gi0/45

     Transmit GigabitEthernet0/45        Receive
    372161753 Bytes                   3922461464 Bytes
   3323064978 Unicast frames          2309878058 Unicast frames
         3348 Multicast frames            246432 Multicast frames
       158248 Broadcast frames             49637 Broadcast frames
            0 Too old frames          3251834333 Unicast bytes
            0 Deferred frames           16990019 Multicast bytes
            0 MTU exceeded frames        3176768 Broadcast bytes
            0 1 collision frames               0 Alignment errors
            0 2 collision frames               0 FCS errors
            0 3 collision frames               0 Oversize frames
            0 4 collision frames               0 Undersize frames
            0 5 collision frames               0 Collision fragments
            0 6 collision frames
            0 7 collision frames           41549 Minimum size frames
            0 8 collision frames      1836935154 65 to 127 byte frames
            0 9 collision frames        22806228 128 to 255 byte frames
            0 10 collision frames      234511817 256 to 511 byte frames
            0 11 collision frames      115680563 512 to 1023 byte frames
            0 12 collision frames       18333995 1024 to 1518 byte frames
            0 13 collision frames              0 Overrun frames
            0 14 collision frames              0 Pause frames
            0 15 collision frames
            0 Excessive collisions             0 Symbol error frames
            0 Late collisions                  0 Invalid frames, too large
            0 VLAN discard frames       81864821 Valid frames, too large
            0 Excess defer frames              0 Invalid frames, too small
         2207 64 byte frames                   0 Valid frames, too small
    331806887 127 byte frames
     57469089 255 byte frames                  0 Too old frames
     94297557 511 byte frames                  0 Valid oversize frames
    135809581 1023 byte frames                 0 System FCS error frames
   1481837111 1518 byte frames                 0 RxPortFifoFull drop frame
   1222004142 Too large frames
            0 Good (1 coll) frames
            0 Good (>1 coll) frames


There are too many "Too large frames".
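
One way to sanity-check the "Too large frames" counter is to add up the frame-size counters and compare the sum with the total frame count. The Python sketch below uses only numbers copied from the output above. If the sums match, "Too large frames" (and "Valid frames, too large" on the receive side) behaves like one more size bucket for frames above 1518 bytes, which would be expected on a port with MTU 9000. That reading is an inference from the arithmetic, not from Cisco documentation.

```python
# Counters copied from "show controllers ethernet-controller gi0/45" above.
tx_frames = 3_323_064_978 + 3_348 + 158_248    # unicast + multicast + broadcast
tx_buckets = {
    "64 byte": 2_207,
    "127 byte": 331_806_887,
    "255 byte": 57_469_089,
    "511 byte": 94_297_557,
    "1023 byte": 135_809_581,
    "1518 byte": 1_481_837_111,
    "Too large": 1_222_004_142,
}

rx_frames = 2_309_878_058 + 246_432 + 49_637   # unicast + multicast + broadcast
rx_buckets = {
    "Minimum size": 41_549,
    "65 to 127": 1_836_935_154,
    "128 to 255": 22_806_228,
    "256 to 511": 234_511_817,
    "512 to 1023": 115_680_563,
    "1024 to 1518": 18_333_995,
    "Valid, too large": 81_864_821,
}

for side, frames, buckets in (("tx", tx_frames, tx_buckets),
                              ("rx", rx_frames, rx_buckets)):
    bucket_sum = sum(buckets.values())
    print(f"{side}: frames={frames} bucket_sum={bucket_sum} "
          f"match={frames == bucket_sum}")

too_large_share = tx_buckets["Too large"] / tx_frames
print(f"tx frames above 1518 bytes: {too_large_share:.1%}")
```

Both sums match the frame totals exactly, so the counter looks like a size histogram entry: more than a third of transmitted frames are simply larger than 1518 bytes.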


Where could the problem with the switch be?


Regards,
Anton Turygin
_______________________________________________
cisco-nsp mailing list cisco-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Peter Rathlev
2010-06-18 12:21:31 UTC
Post by Anton Turygin
Getting output drops and packet loss on Catalyst WS-C2960G-48TC-L.
Someone should start selling T-shirts with a pun on that. :-)
Post by Anton Turygin
The traffic is relatively small, but output drops are growing by hundreds
per second.
GigabitEthernet0/45 is up, line protocol is up (connected)
[...]
Post by Anton Turygin
Input queue: 0/4096/0/0 (size/max/drops/flushes); Total output drops: 32013814
Queueing strategy: fifo
Output queue: 0/4096 (size/max)
5 minute input rate 57250000 bits/sec, 33933 packets/sec
5 minute output rate 480872000 bits/sec, 47800 packets/sec
[...]

The problem is probably "micro-bursts", which the 2960 is exceptionally
bad at handling.

You can adjust some of the SRR queueing parameters. Look in the list
archive for details on the 3560, which has similar problems that have
been discussed extensively.
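
To illustrate why micro-bursts cause drops even when the average rate looks low, here is a minimal Python sketch of a tail-drop FIFO. All numbers are hypothetical and chosen for readability; they are not the 2960's real buffer sizes, and the model ignores SRR and per-queue thresholds.

```python
def simulate_fifo(arrivals, drain_per_tick, queue_limit):
    """Return (sent, dropped) for a simple tail-drop FIFO.

    arrivals:       frames arriving in each tick
    drain_per_tick: frames the port can transmit per tick
    queue_limit:    frames the egress queue can hold
    """
    queued = sent = dropped = 0
    for arriving in arrivals:
        # Enqueue what fits; tail-drop the rest.
        accepted = min(arriving, queue_limit - queued)
        dropped += arriving - accepted
        queued += accepted
        # Transmit up to the line rate for this tick.
        out = min(queued, drain_per_tick)
        queued -= out
        sent += out
    return sent, dropped


# Hypothetical traffic: both patterns offer 5000 frames over 100 ticks,
# i.e. 50% of a port that can send 100 frames per tick.
smooth = [50] * 100
bursty = ([500] + [0] * 9) * 10

print("smooth:", simulate_fifo(smooth, drain_per_tick=100, queue_limit=100))
print("bursty:", simulate_fifo(bursty, drain_per_tick=100, queue_limit=100))
```

Both patterns show the same 50% average load on a 5-minute counter, but only the bursty one overflows the queue. Larger per-queue buffers or thresholds raise `queue_limit` in this model, which is what tuning the queueing parameters aims for.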

Or get another switch. :-)
--
Peter


Anton Turygin
2010-06-18 15:02:23 UTC
Post by Peter Rathlev
Post by Anton Turygin
Getting output drops and packet loss on Catalyst WS-C2960G-48TC-L.
Someone should start selling T-shirts with a pun on that. :-)
Post by Anton Turygin
The traffic is relatively small, but output drops are growing by hundreds
per second.
GigabitEthernet0/45 is up, line protocol is up (connected)
[...]
Post by Anton Turygin
Input queue: 0/4096/0/0 (size/max/drops/flushes); Total output drops: 32013814
Queueing strategy: fifo
Output queue: 0/4096 (size/max)
5 minute input rate 57250000 bits/sec, 33933 packets/sec
5 minute output rate 480872000 bits/sec, 47800 packets/sec
[...]
The problem is probably "micro-bursts", which the 2960 is exceptionally
bad at handling.
You can adjust some of the SRR queueing parameters. Look in the list
archive for details on the 3560, which has similar problems that have
been discussed extensively.
Unfortunately, that didn't work for me.
Post by Peter Rathlev
Or get another switch. :-)
Which model/vendor would you advise?


Regards,
Anton Turygin
Sascha Pollok
2010-06-18 16:34:57 UTC
Post by Peter Rathlev
Post by Anton Turygin
Getting output drops and packet loss on Catalyst WS-C2960G-48TC-L.
Someone should start selling T-shirts with a pun on that. :-)
Any idea how the EOSed 2970 performs in terms of buffers and
bursts? I have some of those in stock and wondering where to
put them next.

-Sascha

Peter Rathlev
2010-06-22 08:53:24 UTC
Post by Sascha Pollok
Any idea how the EOSed 2970 performs in terms of buffers and
bursts? I have some of those in stock and wondering where to
put them next.
I just tested with a 2970 and it had no problems pushing 11+ MB/s when
transferring a 6 GB VirtualBox disk image. So it seems it does not have
the buffering problems of the 2960/3560/3750 family.

I tested between Gi0/1 and Gi0/3, so on the same ASIC. The switch had a
blank configuration, except for "speed auto 100" on one interface
(Gi0/1). The software was 12.2(25)SEC2 LAN Base, but I don't think that
matters too much. It dropped 83 packets out of ~5 million.
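
To put these test figures in proportion, here is a short Python calculation. It assumes that "11+ MB/s" means 11 * 10**6 bytes per second of payload and that "speed auto 100" left Gi0/1 running at 100 Mbit/s; both are readings of the text above, not measured values.

```python
link_bps = 100_000_000               # Gi0/1 limited to 100 Mbit/s (assumed)
throughput_bytes_per_s = 11_000_000  # "11+ MB/s", assuming 1 MB = 10**6 bytes
dropped = 83                         # dropped packets reported
total_packets = 5_000_000            # "~5 million packets"

link_share = throughput_bytes_per_s * 8 / link_bps
loss = dropped / total_packets

print(f"payload rate as a share of the 100 Mbit/s link: {link_share:.0%}")
print(f"packet loss: {loss:.4%}")
```

The payload rate excludes Ethernet, IP and TCP overhead, so the rate on the wire is somewhat higher than the share printed here.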
--
Peter


LM
2010-06-22 10:31:11 UTC
Is there any command on the switch to detect possible packet loss,
beyond the error counters under "sh int"? I am curious about
the ASIC values and buffer issues.
Post by Peter Rathlev
Post by Sascha Pollok
Any idea how the EOSed 2970 performs in terms of buffers and
bursts? I have some of those in stock and wondering where to
put them next.
I just tested with a 2970 and it had no problems pushing 11+ MB/s when
transferring a 6 GB VirtualBox disk image. So it seems it does not have
the buffering problems of the 2960/3560/3750 family.
I tested between Gi0/1 and Gi0/3, so on the same ASIC. The switch had a
blank configuration, except for "speed auto 100" on one interface
(Gi0/1). The software was 12.2(25)SEC2 LAN Base, but I don't think that
matters too much. It dropped 83 packets out of ~5 million.
Peter Rathlev
2010-06-22 11:26:37 UTC
Post by LM
Is there any command on the switch to detect possible packet loss,
beyond the error counters under "sh int"? I am curious about
the ASIC values and buffer issues.
The 2970 has the "show platform pm if-numbers" and "show platform
port-asic stats drop asic <X>" commands to give you that kind of
information, just like the 3560/3750.

The rest of this (long) mail is the output from a switch that
has dropped 12 frames on Gi0/1.

Switch#show platform pm if-numbers

interface  gid  gpn  lpn  port  slot  unit  slun  port-type  lpn-idb  gpn-idb
-----------------------------------------------------------------------------
Gi0/1        1    1    1  1/3      1     1     1  local      Yes      Yes
Gi0/2        2    2    2  1/2      1     2     2  local      Yes      Yes
Gi0/3        3    3    3  1/0      1     3     3  local      Yes      Yes
Gi0/4        4    4    4  1/1      1     4     4  local      Yes      Yes
Gi0/5        5    5    5  2/3      1     5     5  local      Yes      Yes
Gi0/6        6    6    6  2/2      1     6     6  local      Yes      Yes
Gi0/7        7    7    7  2/0      1     7     7  local      Yes      Yes
Gi0/8        8    8    8  2/1      1     8     8  local      Yes      Yes
Gi0/9        9    9    9  0/3      1     9     9  local      Yes      Yes
Gi0/10      10   10   10  0/2      1    10    10  local      Yes      Yes
Gi0/11      11   11   11  0/0      1    11    11  local      Yes      Yes
Gi0/12      12   12   12  0/1      1    12    12  local      Yes      Yes
Gi0/13      13   13   13  3/3      1    13    13  local      Yes      Yes
[...]
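
For anyone scripting this lookup, here is a minimal Python sketch that maps an interface name to the ASIC and ASIC port to query. It assumes the "port" column reads as asic/port, which is inferred from this mail (Gi0/1 shows 1/3, and the drops appear under "asic 1", Port 3) and is not taken from Cisco documentation. The function name is hypothetical.

```python
def parse_if_numbers(text):
    """Map interface name -> (asic, asic_port).

    Expects text shaped like the 'show platform pm if-numbers' output above.
    Assumption: the fifth column ("port") is "<asic>/<port>".
    """
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with an interface name such as Gi0/1 and carry
        # the asic/port pair in the fifth column; skip header and rule lines.
        if len(fields) < 5 or "/" not in fields[0] or "/" not in fields[4]:
            continue
        asic, port = fields[4].split("/", 1)
        if asic.isdigit() and port.isdigit():
            mapping[fields[0]] = (int(asic), int(port))
    return mapping


sample = """\
interface gid gpn lpn port slot unit slun port-type lpn-idb gpn-idb
---------------------------------------------------------------------
Gi0/1 1 1 1 1/3 1 1 1 local Yes Yes
Gi0/3 3 3 3 1/0 1 3 3 local Yes Yes
Gi0/9 9 9 9 0/3 1 9 9 local Yes Yes
"""

ports = parse_if_numbers(sample)
asic, port = ports["Gi0/1"]
print(f"Gi0/1 -> show platform port-asic stats drop asic {asic} (look at Port {port})")
```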

Switch#sh platform port-asic stats drop asic 1

Port-asic Port Drop Statistics - Summary
========================================
RxQueue 0 Drop Stats: 0
RxQueue 1 Drop Stats: 0
RxQueue 2 Drop Stats: 0
RxQueue 3 Drop Stats: 0

Port 0 TxQueue Drop Stats: 0
Port 1 TxQueue Drop Stats: 0
Port 2 TxQueue Drop Stats: 0
Port 3 TxQueue Drop Stats: 12

Supervisor TxQueue Drop Statistics
Queue 0: 0
Queue 1: 0
Queue 2: 0
Queue 3: 0
Queue 4: 0
Queue 5: 0
Queue 6: 0
Queue 7: 0
Queue 8: 0
Queue 9: 0
Queue 10: 0
Queue 11: 0
Queue 12: 0
Queue 13: 0
Queue 14: 0
Queue 15: 0

Port-asic Port Drop Statistics - Details
========================================
RxQueue Drop Statistics
Queue 0
Weight 0 Frames: 0
Weight 1 Frames: 0
Weight 2 Frames: 0
Queue 1
Weight 0 Frames: 0
Weight 1 Frames: 0
Weight 2 Frames: 0
Queue 2
Weight 0 Frames: 0
Weight 1 Frames: 0
Weight 2 Frames: 0
Queue 3
Weight 0 Frames: 0
Weight 1 Frames: 0
Weight 2 Frames: 0

Port 0 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0

Port 1 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0

Port 2 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0

Port 3 TxQueue Drop Statistics
Queue 0
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 1
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 2
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 0
Queue 3
Weight 0 Frames 0
Weight 1 Frames 0
Weight 2 Frames 12
Supervisor TxQueue Drop Statistics
Queue 0: 0
Queue 1: 0
Queue 2: 0
Queue 3: 0
Queue 4: 0
Queue 5: 0
Queue 6: 0
Queue 7: 0
Queue 8: 0
Queue 9: 0
Queue 10: 0
Queue 11: 0
Queue 12: 0
Queue 13: 0
Queue 14: 0
Queue 15: 0
Switch#sh ver
Cisco IOS Software, C2970 Software (C2970-LANBASEK9-M), Version 12.2(25)SEC2, RELEASE SOFTWARE (fc1)
Copyright (c) 1986-2005 by Cisco Systems, Inc.
Compiled Wed 31-Aug-05 10:12 by antonino

ROM: Bootstrap program is C2970 boot loader
BOOTLDR: C2970 Boot Loader (C2970-HBOOT-M) Version 12.1(14r)EA1a, RELEASE SOFTWARE (fc1)

Switch uptime is 22 minutes
System returned to ROM by power-on
System image file is "flash:/c2970-lanbasek9-mz.122-25.SEC2.bin"
[...]
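
Since the drop statistics above are almost all zeros, a small filter helps when reading them. The Python sketch below pulls the non-zero per-port TxQueue counters out of text shaped like the "Details" section above. The function name and the sample text are illustrative, and the parser has only been matched against the layout shown in this mail.

```python
import re

PORT_RE = re.compile(r"Port (\d+) TxQueue Drop Statistics$")
QUEUE_RE = re.compile(r"Queue (\d+)$")
WEIGHT_RE = re.compile(r"Weight (\d+) Frames:?\s+(\d+)$")


def nonzero_tx_drops(text):
    """Return [(port, queue, weight, frames)] for non-zero TxQueue drops."""
    drops = []
    port = queue = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("Drop Statistics"):
            # Section header: track the port for per-port TxQueue sections,
            # reset for RxQueue and Supervisor sections.
            m = PORT_RE.match(line)
            port = int(m.group(1)) if m else None
            queue = None
        elif (m := QUEUE_RE.match(line)) and port is not None:
            queue = int(m.group(1))
        elif (m := WEIGHT_RE.match(line)) and port is not None and queue is not None:
            weight, frames = int(m.group(1)), int(m.group(2))
            if frames:
                drops.append((port, queue, weight, frames))
    return drops


sample = """\
RxQueue Drop Statistics
  Queue 0
    Weight 0 Frames: 0
Port 3 TxQueue Drop Statistics
  Queue 3
    Weight 1 Frames 0
    Weight 2 Frames 12
Supervisor TxQueue Drop Statistics
  Queue 0: 0
"""

for port, queue, weight, frames in nonzero_tx_drops(sample):
    print(f"port {port} queue {queue} weight {weight}: {frames} frames dropped")
```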

HTH.
--
Peter


t***@gmail.com
2010-06-18 16:46:55 UTC
Put a pair of hosts on the 2970, one at gigabit and one at Fast Ethernet, then run iperf between them (-c with -P 50 on one host, -s on the other) and tell *us* what you see for discards out of the slower of the two interfaces.

If you've got the gear, actual testing will probably tell you more than non-existent specs.
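
When reporting results from such a test, a drop rate is more useful than a raw counter. Here is a minimal Python sketch for turning two readings of an output-drop counter into drops per second; the function name and the example readings are hypothetical.

```python
def drop_rate(before, after, seconds):
    """Drops per second between two readings of an increasing counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if after < before:
        # The counter was cleared or wrapped between the two readings.
        raise ValueError("counter went backwards")
    return (after - before) / seconds


# Hypothetical readings taken 10 seconds apart during the iperf run.
rate = drop_rate(before=1_000, after=4_000, seconds=10)
print(f"{rate:.0f} drops/s")
```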

------Original Message------
From: Sascha Pollok
Sender: cisco-nsp-***@puck.nether.net
To: Peter Rathlev
Cc: cisco-***@puck.nether.net
Subject: Re: [c-nsp] How to find the root cause of packet loss
Sent: Jun 18, 2010 12:34 PM
Post by Peter Rathlev
Post by Anton Turygin
Getting output drops and packet loss on Catalyst WS-C2960G-48TC-L.
Someone should start selling T-shirts with a pun on that. :-)
Any idea how the EOSed 2970 performs in terms of buffers and
bursts? I have some of those in stock and wondering where to
put them next.

-Sascha



-Tk
