Flooding Unicast Traffic due to Load-sharing NICs

  • Updated 4 years ago
Article ID: 2839

Protocols/Features
NIC Teaming

Symptoms
Unicast packets flooding across network
Unicast flooding
Network is locked up
Can't pass traffic
All LEDs are solid
Broadcast storm
Network respans

Cause
Servers with multiple NICs can cause what appears to be unicast flooding on your network.

Some servers with multiple NICs installed automatically make one NIC receive-only and the other transmit-only. This increases throughput but prevents the SAT (Source Address Table) in the switches/bridges from staying up to date. A switch learns SAT entries from the source address of the frames it receives, so a NIC that never transmits is never relearned and its entry eventually ages out. The SAT is used to forward traffic to the correct destination interface; if it has no entry for a destination, the switch floods the unicast traffic out all other interfaces.

Several server mirroring programs behave the same way: one server becomes the receiver and the other becomes the transmitter. Again, this is good for server throughput but prevents the bridges/switches from updating the SAT.

When a transfer is initially set up between the servers, or between a server and a user, the receive NIC answers the request and replies to the end device, which updates the SAT. This allows the unicast traffic to be switched properly through your network. Flooding starts if the backup or download takes more than 5 minutes, because the default SAT time-out on Enterasys switches is 300 seconds and the receive NIC sends nothing during the transfer to refresh its entry. This is most noticeable with server-to-server backups or extremely large file transfers.
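
The timing described above can be illustrated with a small simulation. This is only a sketch of a generic learning bridge with a 300-second aging timer, not Enterasys firmware behavior; the class name, port numbers, and MAC addresses are hypothetical.

```python
from dataclasses import dataclass, field

AGING_SECONDS = 300  # default SAT time-out cited in this article


@dataclass
class LearningSwitch:
    """Toy model of a bridge's Source Address Table (SAT) with aging."""
    ports: tuple
    aging: int = AGING_SECONDS
    sat: dict = field(default_factory=dict)  # MAC -> (port, time last seen as source)

    def forward(self, now, in_port, src, dst):
        """Learn the source MAC, then return the list of egress ports."""
        self.sat[src] = (in_port, now)
        entry = self.sat.get(dst)
        if entry is not None and now - entry[1] < self.aging:
            return [entry[0]]  # known destination: switch to one port
        self.sat.pop(dst, None)  # entry expired or never learned
        return [p for p in self.ports if p != in_port]  # flood


# Hypothetical addresses: server receive NIC, server transmit NIC, client.
RX_NIC, TX_NIC, CLIENT = "00:00:00:00:00:a1", "00:00:00:00:00:a2", "00:00:00:00:00:c1"
sw = LearningSwitch(ports=(1, 2, 3, 4))

# t=0: the receive NIC (port 1) answers the session setup, so it is learned.
sw.forward(0, 1, RX_NIC, CLIENT)
# Afterwards the server replies only from its transmit NIC (port 2),
# while the client (port 3) keeps sending data to the receive NIC.
early = sw.forward(60, 3, CLIENT, RX_NIC)
late = sw.forward(301, 3, CLIENT, RX_NIC)
print(early)  # [1]
print(late)   # [1, 2, 4]
```

After the entry for the receive NIC ages out, every frame addressed to it is sent out all other ports until that NIC transmits again.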

If you suspect a flooding problem, take a trace and look at the destination address of the flooded unicast packets. If the destination is always the same and the trace shows no reply from that end device, this multiple-NIC issue could be the cause.
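
The trace check above can be approximated in code once the capture has been exported as source/destination MAC pairs. The following is a rough sketch; the function names, the `min_frames` threshold, and the sample addresses are assumptions made for illustration, not part of any analyzer's API.

```python
from collections import Counter


def is_group_address(mac):
    """True for broadcast/multicast MACs (low bit of the first octet is set)."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def silent_destinations(frames, min_frames=100):
    """Return unicast destination MACs that receive many frames but never send.

    frames: iterable of (src_mac, dst_mac) pairs taken from a capture.
    """
    sources = set()
    dst_counts = Counter()
    for src, dst in frames:
        sources.add(src.lower())
        dst_counts[dst.lower()] += 1
    return sorted(
        mac for mac, count in dst_counts.items()
        if count >= min_frames and mac not in sources and not is_group_address(mac)
    )


# Hypothetical capture: the client floods the receive NIC, which never replies.
frames = (
    [("00:00:00:00:00:c1", "00:00:00:00:00:a1")] * 150
    + [("00:00:00:00:00:a2", "00:00:00:00:00:c1")] * 150
    + [("00:00:00:00:00:c1", "ff:ff:ff:ff:ff:ff")] * 120
)
suspects = silent_destinations(frames)
print(suspects)  # ['00:00:00:00:00:a1']
```

A MAC address reported here is only a candidate; it still needs to be matched to a server with load-sharing NICs.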

Another way to check is to view the SAT of the bridge/switch with a MIB browser (e.g. NetSight Atlas) and see whether the end device's MAC address is there. Query OID 1.3.6.1.2.1.17.4.3.1.1 (dot1dTpFdbAddress in the standard Bridge MIB), which lists each learned MAC address in hex, indexed by the same address written as decimal octets. OID 1.3.6.1.2.1.17.4.3.1.2 (dot1dTpFdbPort) gives the address-to-interface association. You must do this while the problem is occurring, not afterwards, because by then the receiving NIC may have answered another request and repopulated its entry.
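
If the MIB browser shows only instance OIDs, the decimal-to-hex conversion can be done by hand or with a few lines of code. The sketch below assumes the walk results have already been collected into a dictionary of instance OID to port number; the helper names and the sample entry are hypothetical.

```python
FDB_ADDRESS = "1.3.6.1.2.1.17.4.3.1.1"  # learned MAC addresses
FDB_PORT = "1.3.6.1.2.1.17.4.3.1.2"     # address-to-interface association


def mac_from_instance(oid):
    """Convert the six trailing decimal sub-identifiers of an instance OID to a MAC."""
    octets = [int(part) for part in oid.strip(".").split(".")[-6:]]
    if len(octets) != 6 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a MAC-indexed instance: {oid}")
    return ":".join(f"{o:02x}" for o in octets)


def port_for_mac(walk, mac):
    """Look up a MAC in walk results given as {instance OID: port number}."""
    for oid, port in walk.items():
        if oid.startswith(FDB_PORT + ".") and mac_from_instance(oid) == mac.lower():
            return port
    return None  # MAC absent from the SAT


# Hypothetical walk result: one learned address on port 7.
walk = {FDB_PORT + ".0.1.244.18.52.86": 7}
print(mac_from_instance(FDB_PORT + ".0.1.244.18.52.86"))  # 00:01:f4:12:34:56
print(port_for_mac(walk, "00:01:F4:12:34:56"))            # 7
print(port_for_mac(walk, "00:01:f4:aa:bb:cc"))            # None
```

A result of `None` for the server's receive NIC while the flooding is in progress supports the diagnosis.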

Solution
If both the switch and the teamed NICs are LACP-compliant, enable 802.3ad Link Aggregation on both sides so that a Dynamic LAG forms when the devices are attached. The resulting single logical "pipe" should not be subject to the issue described in this document.

Workaround: Set up a background ping every 4 minutes to the external NIC of the receiving station/server. Each time the external NIC responds to the ping, it refreshes its entry in the SAT of your switches/bridges before the 300-second time-out expires, which prevents the flooding.
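
One possible way to run such a background ping is a small script on any always-on host. This is a minimal sketch that shells out to the operating system's `ping` command; the function names, the `rounds` parameter, and the address in the usage note are hypothetical.

```python
import platform
import subprocess
import time

INTERVAL_SECONDS = 240  # every 4 minutes, inside the 300-second SAT time-out


def ping_command(host):
    """Build a single-echo ping command for the local operating system."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def keepalive(host, interval=INTERVAL_SECONDS, rounds=None,
              run=subprocess.run, sleep=time.sleep):
    """Ping host repeatedly; rounds=None loops until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        # Output is discarded; only the NIC's reply on the wire matters here.
        run(ping_command(host), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
    return done
```

Calling `keepalive("192.0.2.10")` with the address of the receiving NIC would then run until interrupted. The `run` and `sleep` parameters exist only so the loop can be exercised without sending real pings.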

If you are considering changing the time-out on the switches, then please refer to Article 3460 "Changing Source Address Time-out on Switches and Bridges" first.

See also: 5322.
FAQ User, Official Rep

Posted 4 years ago
