Rate-limit

  • 20 November 2014
  • 23 replies
  • 3016 views

Userlevel 3
Hi everybody.
I have two Summit X670 (15.4.1.3) switches and I'd like to limit inbound broadcast, multicast, and unknown unicast packets on specific ports. So I've configured a rate limit of 500 pps:

config port 3 rate-limit flood broadcast 500
config port 3 rate-limit flood multicast 500
config port 3 rate-limit flood unknown-destmac 500

Then I looked at the output of the "show ports 3 stat" command. I see only 10-20 pps, but the Flood Rate Exceeded counter is increasing and I get log messages like:
Flood Rate Limiting activated on Port 3



Userlevel 4
Can you write an ACL to count the ingress packets, just to confirm what packets are being seen?
Also check "show l2stats" for that specific VLAN to see how many packets are multicast and how many are broadcast:
clear l2stats
show l2stats vlan
Userlevel 3
I can write an ACL like this:
entry BCAST {
    if {
        ethernet-destination-address ff:ff:ff:ff:ff:ff;
    }
    then {
        packet-count bcast-pkt;
    }
}

As for "show l2stats": I don't see any broadcast or multicast counters in the output of this command. I see only:
Bridge interface on VLAN Default:
Total number of packets to CPU = 2923.
Total number of packets learned = 38837.
Total number of IGMP control packets snooped = 255.
Total number of IGMP data packets switched = 218.
Total number of MLD control packets snooped = 0.
Total number of MLD data packets switched = 0.

Userlevel 4
Hello. Broadcast packets hit the CPU directly, and unknown unicast shows up as MAC learning:
Total number of packets learned = 38837.
clear l2stats
show l2stats vlan
Run these commands 5 times at 1-second intervals to see how many packets are hitting the CPU.

Userlevel 3
OK, but I have about 50 VLANs on this port.

Anyway, here is the output of the "show l2stats vlan Default" command, run at 1-second intervals:

Bridge interface on VLAN Default:
Total number of packets to CPU = 5.
Total number of packets learned = 48.

Bridge interface on VLAN Default:
Total number of packets to CPU = 8.
Total number of packets learned = 70.

Bridge interface on VLAN Default:
Total number of packets to CPU = 11.
Total number of packets learned = 104.

Bridge interface on VLAN Default:
Total number of packets to CPU = 14.
Total number of packets learned = 132.

Bridge interface on VLAN Default:
Total number of packets to CPU = 16.
Total number of packets learned = 142.

Bridge interface on VLAN Default:
Total number of packets to CPU = 18.
Total number of packets learned = 174.
Userlevel 3
This is the counter from the ACL:

Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 43

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 43

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 50

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 62

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 69

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 70

* xCore1.49 # sh access-list counter
Policy Name Vlan Name Port Direction
Counter Name Packet Count Byte Count
==================================================================
bcast-counter * 2 ingress
bcast-pkt 83

I ran this command at 1-second intervals.

Userlevel 4
These outputs don't justify saying there is an issue; the observed rates are nowhere near the limit.
The next step would be to take a packet capture on the ingress.
You can do it yourself with port mirroring, or,
if you reach out to TAC, they can run tcpdump, which needs a debug password to get into debug mode.

Could you also check the CPU utilization?
Is any specific process high, e.g. bcmRX?
Userlevel 3
Thanks. CPU utilization is at a normal level. I'm going to capture traffic on this interface.

Could it be a bug in EXOS or something? I found this topic here: https://community.extremenetworks.com/extreme/topics/problem_with_rate_limit_on_summit_x650-l8ftu

Userlevel 4
It could be, but I would always do the maximum troubleshooting I can and then reach out to TAC for confirmation.
Userlevel 3
Ok. Thanks.
Userlevel 3
Hi everybody.
It might be interesting for somebody: I've got an explanation from TAC. My rate-limit problem is a peculiarity, or feature, of the platform.

One second is divided into 15.625-microsecond intervals (64,000 of them per second). The rate-limiting mechanism kicks in when the platform receives too many packets within one 15.625 µs interval, not too many per second.

For example, I've configured:
conf ports 25 rate-limit flood broadcast 100000

The rate-limiting mechanism then triggers when the box receives more than ~1.56 packets (100000 / 64000) within a single 15.625 µs interval.
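
A quick back-of-the-envelope sketch in Python (my own illustration of the arithmetic, not anything TAC provided) that reproduces these numbers and shows why my original 500 pps limit trips on even a tiny burst:

SLICE_US = 15.625                              # hardware time slice, in microseconds
SLICES_PER_SEC = round(1_000_000 / SLICE_US)   # 64000 slices per second

def packets_per_slice(limit_pps):
    # Packets' worth of budget available within one 15.625 us time slice.
    return limit_pps / SLICES_PER_SEC

print(packets_per_slice(100000))   # ~1.5625 -> limiter trips at 2 packets in one slice
print(packets_per_slice(500))      # ~0.0078 -> two back-to-back packets in one slice
                                   #            already blow the per-slice budget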
Userlevel 7
Hi,

if you want to stick to a pps measurement, apply a meter; otherwise, yes, the newer chipsets no longer work at a per-second sampling rate. This is why you may hit the rate limit even though you don't have that much traffic on a per-second basis.

As for tcpdump (I saw it mentioned in this thread): if you plan to use it, keep in mind that you're sniffing in software, so it potentially takes a lot of resources, which may have bad side effects. So take care with that. Port mirroring happens in hardware; it's better to use that.
Userlevel 3
Thanks, useful advice.
Userlevel 6
Hi,

I'm currently fighting on the same front: I use rate-limit (1000 pps on a 1 Gb link) and wonder why the rate limit is exceeded while the counters are far away from this 1000 pps threshold.

I use an X670V-G2, and I use rate-limit to avoid loops, because RSTP is too complex to set up in my environment (so that it works with NetLogin and stays compatible with Enterasys).

So I think that because the new (G2) switches measure based on the mentioned 15.625 µs time slots (tokens), a CLI threshold configured in packets per second will never work reliably. In other words, the ASIC cannot reliably enforce the pps limit configured in the CLI.

I think there is currently (nearly) an ON/OFF situation regarding rate limits. If you turn it on, set it to the maximum value of 262144, because then at most 4 packets per time slot (15.625 µs) are allowed, or at least to a value of 131072, which allows 2 packets per slot; see the sketch below.
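
To sanity-check those numbers, here is my own arithmetic in Python, assuming each packet "costs" 64000 tokens as KB2628 (and a later post in this thread) describes:

TOKENS_PER_PACKET = 64000   # assumed per-packet cost, per KB2628

# The tokens added per 15.625 us time slot equal the configured pps value,
# so the number of whole packets allowed per slot is:
for limit_pps in (262144, 131072, 1000):
    print(limit_pps, "pps ->", limit_pps // TOKENS_PER_PACKET, "packet(s) per slot")
# 262144 pps -> 4 packet(s) per slot
# 131072 pps -> 2 packet(s) per slot
# 1000 pps   -> 0 packet(s) per slot, i.e. effectively ON/OFF as described above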

This is a step back, a disadvantage compared to the G1 switch platform; normally we expect a new switch to offer better features!

There is currently a KB article which explains this:
"How to rate-limit implement on EXOS platforms?" (KB2628)

Is my understanding right? If not, please correct me.

Regards

Userlevel 7
Hi, I believe a fix for this feature is included in 21.1. You might want to try it if you have G2 hardware.
Userlevel 6
Hi Stephane,

I had a look at the current release notes for 21.1.1, and I found only this:

Configuration of Overhead-bytes in Calculating Rate-Limiting and Rate-Shaping

But that changes nothing regarding the unreliable rate-limit threshold on G2 switches.

Regards
Userlevel 7
Did you test with 21.1?
Userlevel 6
Hi Stephane,
I did not test it, because if nothing is written in the release notes, engineering has not changed the relevant code!

Do you think, or know, whether anything has been changed?
Userlevel 7
From what I recall, this code was fixed in 21.1; I had a thread with engineering on that topic. I never tested it myself, so it would be worth checking in a lab first.
Userlevel 7
Just to document this, I have encountered the issue with basically unusable broadcast limiters on X670 switches with different EXOS versions (15.2, 15.3, 15.6). 😞
Userlevel 7
Yes, this is expected because of the chipsets. Using meters is a good way to solve that. There are match criteria for all BUM traffic types.
Userlevel 7
I have just tested broadcast rate-limits on an X620-16t switch using EXOS 21.1.1.4-patch1-2, and they work fine. So there is a 10G switch, using a newer SoC, with working broadcast limiters. 🙂
Userlevel 7
Ok, so the fix in 21.1 I was referring to above is working. Good to hear. It should work on any other switch running 21.1 and above as well.
Userlevel 3
This is an old thread, but as new releases and hardware have seen the light of day, I think it's appropriate to add some info here.

From tests in the real world and dialogue with the TAC, there seems to be one specific platform that behaves differently and may never get a decent rate-limiting implementation: the X430/X440 (non-G2).

Referring to this article (it seems revised, and in its current version it is almost possible to understand):
https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-rate-limit-implement-on-EXOS-platform

- Measurements are per time slice, and this hardware has a time slice that is 15.625 microseconds long.
- Setting a rate of 64000 allows one packet per time slice (15.625 microseconds)
- Setting a rate of 2 * 64000 = 128000 allows two packets per time slice (15.625 microseconds)
- Setting a rate of n * 64000 allows "n" packets per time slice (15.625 microseconds)
- For each second, one extra packet is allowed due to how buffers are handled (or is this per 15.625 us time slice?)

All this is based on the fact that one packet "costs" 64000 tokens in order to be processed, and the rate limit is in reality the number of tokens added to the bucket every time slice. If you add 64000 tokens to the bucket every time slice, you can afford to "buy" one packet per interval (time slice). I assume the bucket is emptied at the end of every time slice and new tokens are added, the number of tokens being the rate limit you defined (see the sketch after Scenario 2 below).

Scenario 1:

Rate limit set to 64000.
Packets are coming in at a pace of two packets every 15.625 microseconds (not realistic, but for the sake of discussion).
Result:
One packet per 15.625 microsecond interval will be accepted and the second packet will be discarded, resulting in 64001 packets coming through (counting the one in the buffer) each second.
In this unrealistic scenario, we get the requested amount of packets per second (plus one for free 😉 ).

Scenario 2:

Rate limit set to 128000.
Small 64-byte packets come in as a burst of 10 packets, with no gap in between them, taking 6.72 microseconds in total to transmit (at 1 Gbps line rate).
All packets happen to arrive in the same time slice (the same 15.625 us interval).
As the limit of 128000 allows two packets per time slice and all packets arriving in the same time slice, only two packets are allowed (plus one in the buffer making it three packets).
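
A small Python sketch (my own, built on the token model and the bucket-reset assumption above, so treat it as an illustration rather than the actual ASIC behavior) that reproduces both scenarios:

TOKENS_PER_PACKET = 64000   # assumed per-packet cost

def accepted_in_slice(rate_limit, burst):
    # The bucket is refilled with rate_limit tokens at the start of each
    # 15.625 us time slice and, per the assumption above, emptied at its end.
    bucket = rate_limit
    accepted = 0
    for _ in range(burst):
        if bucket >= TOKENS_PER_PACKET:
            bucket -= TOKENS_PER_PACKET
            accepted += 1
    return accepted

print(accepted_in_slice(64000, 2))    # Scenario 1: 1 per slice -> 64000/s (+1 buffered)
print(accepted_in_slice(128000, 10))  # Scenario 2: 2 of the 10-packet burst (+1 buffered)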


In addition, all G2 hardware (including the X620, X690 and X870) does count the packets over a whole second as expected, but only if the EXOS release is 21.2 or higher. I see this expected behavior on the X450e as well, apparently already in EXOS 15.3.5.2 patch1-17.

I hope to get some confirmation or further explanation to this from the TAC.

/Fredrik
