QoS on ERS 4900 HW:03

  • Problem
  • Updated 4 days ago
We have received complaints from users about low network performance.  The users are behind newly deployed ERS 4900 switches with 10Gbps uplinks.  After some investigation, we traced it to a QoS issue.  When a queue-set other than 1 is configured, it is not possible to use the full bandwidth of the access port.  For example, if you set queue-set to 4, a 1Gbps port is shaped to 250Mbps (1000/4), and it is not possible to reach a higher speed.  This problem can be reproduced only on switches with HW revision 03.  We also have switches with HW02, and they behave correctly.
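To make the arithmetic concrete, here is a minimal Python sketch of the ceiling we seem to be hitting. It assumes the port speed is simply divided by the queue-set number, which is only an inference from the 1000/4 = 250Mbps observation, not documented switch behavior. The function name `observed_cap_mbps` is hypothetical.

```python
def observed_cap_mbps(port_speed_mbps: float, queue_set: int) -> float:
    """Estimate the throughput ceiling of an access port.

    Assumption (not confirmed by the vendor): the port speed is
    split evenly by the configured queue-set number.
    """
    if queue_set < 1:
        raise ValueError("queue_set must be >= 1")
    return port_speed_mbps / queue_set


# Queue-set values mentioned in this thread, on a 1Gbps access port.
for qs in (1, 2, 4):
    print(f"queue-set {qs}: ~{observed_cap_mbps(1000, qs):.0f} Mbps")
```

If measured throughput on a 1Gbps port tracks these values as the queue-set changes, that would support the "divided by queue-set" reading.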

Is there anyone else experiencing the same issue?

As a workaround, we set only one queue, which technically disables QoS, but the users are much happier.

Martin Sebek

Posted 4 days ago

Roger Lapuh, Employee

Martin, let me do some digging here. Roger

Ludovico Stevens, Employee

Hello Martin
I'm not aware of such an issue.
You should open a ticket with our GTAC.
Best regards
Ludovico

Martin Sebek

Hello guys, thanks for the quick reply.  I have already opened a case.  This post was just to help others who might eventually experience the same issue.

Take care!

Joseph Wood

Latest code?

If you can get the same workaround effect with queue-set 2, that would be better.   Setting it to 1 lumps control traffic and best-effort traffic into the same queue, and it becomes a free-for-all.

What is your qos agent buffer setting on the HW02 and HW03 switches?   I believe the default is Large.   Sometimes people set it to Maximum (if only because Maximum sounds better), but that causes more problems than it fixes, especially on switches with low-speed or older devices.   Certainly, if your queue-set and agent buffer settings aren't the same on both, then it is not a fair comparison.

Are you seeing Dropped on No Resources incrementing when you are testing?

If you have an extra HW03 switch you can test with (not in production), then I would recommend you factory default it and then test with just a sender and receiver.   If you see the same behavior then it is a bug and Extreme support should be able to reproduce it easily.

Martin Sebek

Yes, version 7.6.0.

But we have been digging more into this and found out it is not tied to a specific HW revision.  We were misled by the fact that the problem could not be reproduced on the switches with HW revision 02.  The actual difference between those switches is the uplink speed.  When the switch is connected to the core via a 1G SFP, it performs as expected.  When we use a 10G SFP+, the speed drops.

It also cannot be reproduced with the factory default (boot default) configuration.  But even when only a basic set of commands (set switch IP, create two VLANs, set tagging for the uplink and membership for the ports) is applied, the performance is affected.

Yes, we are seeing the Dropped on No Resources counter increasing on the access port when the issue occurs.