Hi Martin, Mike,
there is not much packet buffer available in EXOS switches, so rate shaping can only be very limited. That is good, since shaping creates more problems than it solves. If a voice packet arrives at the end system after the preceding packet has left the anti-jitter buffer, it is dropped. Thus there is no need to hold onto VoIP packets for more than, say, 100 ms. Anyway, EXOS switches do not provide that much port buffer if the port runs at 1G or higher.
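To put rough numbers on the 100 ms figure, here is a small back-of-the-envelope sketch. The function names and the 1 MB example buffer are my own illustrative assumptions, not EXOS specifications; the only real inputs are the link speed and the delay.

```python
def buffer_needed_bytes(delay_ms: float, link_bps: float) -> int:
    """Bytes of buffer needed to hold `delay_ms` worth of traffic at line rate."""
    return int(link_bps * delay_ms / 1000 / 8)


def buffer_delay_ms(buffer_bytes: int, link_bps: float) -> float:
    """Milliseconds needed to drain a full buffer at line rate."""
    return buffer_bytes * 8 * 1000 / link_bps


# Holding 100 ms of traffic on a 1G port:
print(buffer_needed_bytes(100, 1e9), "bytes")

# Hypothetical 1 MB of buffer on a 1G port (made-up size, for illustration only):
print(buffer_delay_ms(1_000_000, 1e9), "ms")
```

Holding 100 ms at 1G works out to 12.5 MB for a single port, which is why shaping on that time scale is not realistic with small port buffers.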
Most applications deal one way or another with packet loss, but packets arriving late are more problematic. They are often dropped at the end point (so there was no need for the already congested network to transport them), or have already been re-sent, with the duplicate dropped at the receiver (so why deliver it at all, the network is congested already). TCP's congestion control is often based on packet loss, so delivering packets late instead of dropping them results in TCP sending too much traffic.
I have seen a vivid example of traffic shaping in a satellite network. The WAN optimizer delayed e.g. DNS conversations for 30 seconds, but delivered every packet. The client re-tried twice (three requests in total), 1 second apart. The server replied to all three requests, the answers were delivered in 1s intervals as well. But since the traffic shaping WAN optimizer added 30s of delay, the client had timed out already and dropped the incoming replies.
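The timeline of that DNS example can be sketched as follows. The send times (0 s, 1 s, 2 s) and the 30 s added delay are from the case above; the 5 s the client waits after its last retry is an assumed value, and `reply_timeline` is just an illustrative helper.

```python
def reply_timeline(send_times, added_delay_s, give_up_after_s):
    """Return (sent, reply_arrival, still_usable) for each request.

    Assumption: the client gives up `give_up_after_s` seconds after its
    last retry; any reply arriving later is dropped by the client.
    """
    give_up_at = send_times[-1] + give_up_after_s
    timeline = []
    for sent in send_times:
        arrival = sent + added_delay_s
        timeline.append((sent, arrival, arrival <= give_up_at))
    return timeline


# Three requests 1 s apart, 30 s added by the shaper, assumed 5 s client patience.
for sent, arrival, usable in reply_timeline([0, 1, 2], 30, 5):
    print(f"sent at {sent:>2}s, reply at {arrival:>2}s, usable: {usable}")
```

All three replies cross the congested link and none of them is of any use to the client, so dropping the requests early would have cost less capacity for the same result.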
Rate limiting is done to prevent a strict priority queue from starving out other traffic. This should be done for e.g. voice traffic at the access ports, possibly at highly oversubscribed uplinks, and at the WAN edge.
Brittle protocols like FCoE might profit from a finely tuned QoS setup and big buffers, but the only way to get them to fly is a sufficiently over-provisioned network.
Automatic traffic shaping as done by short-term oversubscribed switch ports (2 times 1G sending to one 1G) utilizing the port buffers works fine (unless the buffers are bloated -- see http://www.bufferbloat.net/). Traffic shaping (or rather packet pacing) by using 100M access ports for video cameras instead of 1G can prevent micro bursts from overwhelming 1G uplinks. But presenting 1G speed to an end system and then using traffic shaping to send the data more slowly, as done in some WAN optimizers, just does not work. The WAN link might be at 100% utilisation, but goodput tends towards zero.
I would recommend not using explicit traffic shaping, only policing (limiting) on ingress. Rate limiting can be useful to stop non-responsive (UDP) flows from drowning out other traffic. It is a safety measure if strict priority queuing is used. Rate limiting flooded traffic can save the network in case of a layer 2 loop. But an enterprise LAN usually does not need any QoS setup to work fine, even for voice. VoIP even works fine over the Internet, with no QoS at all, see Skype or generic SIP telephony providers.
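To illustrate the difference between policing and shaping, here is a minimal sketch of a generic single-rate token bucket policer. This is the textbook mechanism, not a description of how EXOS implements its meters; the class name and the rate and burst values are assumptions for illustration. The point is that a non-conforming packet is dropped immediately instead of being queued and delivered late.

```python
class TokenBucketPolicer:
    """Generic single-rate token bucket: drop, never delay."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time_s = 0.0

    def allow(self, now_s: float, packet_bytes: int) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now_s - self.last_time_s
        self.last_time_s = now_s
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True   # conforming: forward right away
        return False      # non-conforming: drop, no buffering


# Hypothetical values: 8 kbit/s (1000 bytes/s) with a 1500 byte burst.
policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
print(policer.allow(0.0, 1000))  # fits into the initial burst
print(policer.allow(0.0, 1000))  # exceeds the remaining tokens
print(policer.allow(1.0, 1000))  # bucket refilled after one second
```

A shaper would put the second packet into a queue and send it later; the policer simply discards it, which is the loss signal TCP reacts to.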
Traditional AQM (e.g. WRED on EXOS) might work, but usually needs very fine tuning to actually help, and might make the performance problems worse. Modern self-tuning AQM methods like CoDel and PIE have not made it into switch ASICs yet, but might be found in WAN routers, where they help the most.
Practical use of rate limiting:
- prevent priority traffic from starving out everything else
- signal TCP (via packet loss) that it is using all of the available bandwidth
- prevent flooded traffic from saturating a network during a layer 2 loop condition
Practical use of traffic shaping:
Practical use of packet pacing (e.g. 100M access port instead of 1G; or done by sender):
- prevent "micro burst" problems, e.g. for video
- mitigate "TCP incast" problems, e.g. with cluster applications
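The micro burst case can be approximated with a simple fluid model. This is only a rough sketch: it assumes all senders burst at exactly the same time at full access port speed, and the camera count and burst size are made-up numbers.

```python
def peak_backlog_bytes(senders: int, access_bps: float,
                       uplink_bps: float, burst_bytes: int) -> float:
    """Peak queue build-up at the uplink when all senders burst at once.

    Fluid approximation: each sender transmits `burst_bytes` at its access
    port speed, and the uplink drains at `uplink_bps` the whole time.
    """
    burst_duration_s = burst_bytes * 8 / access_bps
    excess_bps = senders * access_bps - uplink_bps
    return max(0.0, excess_bps * burst_duration_s / 8)


# Hypothetical: 4 cameras, 500 kB burst each, one 1G uplink.
print(peak_backlog_bytes(4, 1e9, 1e9, 500_000))    # cameras on 1G access ports
print(peak_backlog_bytes(4, 100e6, 1e9, 500_000))  # cameras on 100M access ports
```

With 1G access ports the uplink has to buffer roughly 1.5 MB for these numbers; with 100M access ports the combined arrival rate stays below the uplink speed and no queue builds up at all.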
I hope this post helps a little to understand QoS, rate limiting, and traffic shaping in the real world.
Regards,
Erik