Load balancing X670 G2

  • Question
  • Updated 8 months ago
  • Answered
Hello everyone!
I have two X670 Extreme switches. I built an EtherChannel (LAG) using LACP over two (or more) links. For load balancing I chose CRC-32 and expected all traffic of a flow to be mapped to the same port on both switches. For example, say port 2 of SW1 is connected to port 2 of SW2 (port 3 to port 3, and so on), with src-ip configured on SW1 and dst-ip configured on SW2 as the input to the CRC-32 load-balancing algorithm. Requests in a flow go from SW1 to SW2, and responses come back from SW2 to SW1. I expected all traffic of a flow to go out and come back on the same port, but the results do not match my expectations: some packets of a flow are mapped to different ports of the LAG. Is there something I have not considered?

Amir A.N


Posted 9 months ago

Nicolas Dreher


Hello Amir,


As far as I know, we cannot choose the source or destination IP address as an input to the algorithm, but I'm not an expert. The answer may lie in the custom parameter below.

enable sharing <master-port> grouping <port-range> algorithm address-based <L2 | L3 | L3_L4 | custom> lacp


Anyway, the same flow (I mean the src-dst IP pair) should go the same way in both directions as long as both addresses are taken into account.
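To illustrate the point about taking both addresses into account: a hash is only direction-independent if swapping source and destination leaves its input unchanged. Below is a minimal Python sketch under a made-up model (XOR of the two IPv4 addresses fed to CRC-32, result taken modulo the number of links). The function name `symmetric_member` is hypothetical, and nothing here is claimed to be the actual ExtremeXOS implementation.

```python
import ipaddress
import zlib


def symmetric_member(src: str, dst: str, num_links: int = 2) -> int:
    """Pick a LAG member from an order-independent key (illustrative model only)."""
    # XOR is commutative, so (src, dst) and (dst, src) give the same key.
    key = int(ipaddress.IPv4Address(src)) ^ int(ipaddress.IPv4Address(dst))
    return zlib.crc32(key.to_bytes(4, "big")) % num_links


# Request A -> B and response B -> A land on the same member index.
forward = symmetric_member("10.0.0.1", "10.0.0.2")
reverse = symmetric_member("10.0.0.2", "10.0.0.1")
print(forward == reverse)  # True, because the key is identical in both directions
```

A hash that simply concatenates source and destination would not have this property, since the byte order of the key changes with the direction.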

Have you activated round-robin as the output algorithm?

Can you post an extract of your configuration here?


Amir A.N

Hey Nicolas!
Thanks for your response. The load-balancing configuration is below. I don't need per-packet load balancing; it should be per flow. Requests in a flow (A to B) leave through SW1, and responses in that flow (B to A) come back via SW2.

SW1:
configure sharing address-based custom ipv4 source-only
configure sharing address-based custom hash-algorithm crc-32 lower
enable sharing 2 grouping 2-3 algorithm address-based custom lacp

SW2:
configure sharing address-based custom ipv4 destination-only
configure sharing address-based custom hash-algorithm crc-32 lower
enable sharing 2 grouping 2-3 algorithm address-based custom lacp
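The intent behind this configuration can be sketched in Python: SW1 hashes the source of a request (host A) and SW2 hashes the destination of the response (again host A), so both switches feed the same address into the same function. This is only a model. Reading "crc-32 lower" as "lower 16 bits of the CRC-32", the modulo step, and the names `crc32_lower_member`, `sw1_member`, and `sw2_member` are all assumptions made for illustration, not documented switch behavior.

```python
import ipaddress
import zlib


def crc32_lower_member(ip: str, num_links: int = 2) -> int:
    """Illustrative model: lower 16 bits of CRC-32 over the address, modulo link count."""
    crc = zlib.crc32(ipaddress.IPv4Address(ip).packed)
    return (crc & 0xFFFF) % num_links


def sw1_member(src: str, dst: str) -> int:
    return crc32_lower_member(src)  # models "ipv4 source-only"


def sw2_member(src: str, dst: str) -> int:
    return crc32_lower_member(dst)  # models "ipv4 destination-only"


a, b = "192.0.2.10", "198.51.100.20"
# Request A -> B leaves SW1, response B -> A leaves SW2.
print(sw1_member(a, b) == sw2_member(b, a))  # True: both switches hash host A
```

In this model the two directions always agree, which is the expectation described above. Any difference seen on real hardware has to come from something the model leaves out.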

Erik Auerswald, Embassador

Hi Amir,

the load balancing is supposed to be per flow, and in my experience it does work that way. Thus I am astonished that you write:
"Some packets of a flow will [be] mapped to different ports of [the] LAG."
How did you measure this?


The documentation contains this bit of information:
Note
In a switch having at least one LAG group with custom algorithm, the egress port for
unknown unicast packets across all LAG groups in switch will be decided based on Layer 3
source and destination IP address.
That would explain why some packets of a flow may use a different port, i.e. they are unknown unicast frames. Then the destination MAC address is learned and the configured load sharing algorithm is used.

Regarding the use of a different port on SW1 than on SW2 although the configuration seems to imply that the same port is used in both directions: Some switches add a seed value to the hash to avoid the ECMP / hash polarization issue, see e.g. the articles "Uneven load sharing of traffic being forwarded through several subsequent load sharing groups" and "Traffic is not being shared across ECMP routes or Not all the ECMP routes are being used for traffic forwarding". I do not know if this is the case for the X670-G2 or not, but this kind of mechanism would result in different ports used although all obvious input values are identical.
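The effect of such a seed can be sketched in Python. The model below is an assumption made purely for illustration: the seed is used as the CRC-32 starting value and the result is taken modulo the number of links. Whether the X670-G2 applies a seed at all, and how, is not known from this thread; `seeded_member` and `diverging_seeds` are hypothetical names.

```python
import ipaddress
import zlib


def seeded_member(ip: str, seed: int, num_links: int = 2) -> int:
    """Illustrative model: the seed is used as the CRC-32 starting value."""
    return zlib.crc32(ipaddress.IPv4Address(ip).packed, seed) % num_links


def diverging_seeds(ip: str, num_links: int = 2) -> list[int]:
    """Return single-bit seeds that pick a different member than seed 0."""
    baseline = seeded_member(ip, 0, num_links)
    return [1 << k for k in range(32)
            if seeded_member(ip, 1 << k, num_links) != baseline]


# Same address, same algorithm, same link count: only the seed differs.
print(seeded_member("10.0.0.1", 0))
print([hex(s) for s in diverging_seeds("10.0.0.1")])
```

Two switches with different seeds can therefore choose different member ports for the same address, even though the visible configuration is identical.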

Thanks,
Erik

Amir A.N

Thank you Erik.
I mentioned that the load-balancing method is based on the source address only (and on the destination address only on the other switch), so I did not follow your point when you say "That would explain why some packets of a flow may use a different port, i.e. they are unknown unicast frames. Then the destination MAC address is learned and the configured load sharing algorithm is used.", because all packets of a flow carry the same hashed address: the source address on the way out and the destination address on the way back.
Also, I am not routing packets on the switches, so I don't need ECMP. I only need the switches to map all packets of a flow to the same port, not to different ports for some of them. The reason is a firewall setup: I have two firewalls between the two switches, and the firewalls run into trouble when they see packets of a flow they have not seen before. Thank you again for your answers.
By the way, I have disabled MAC address learning, so all packets are unknown.

Erik Auerswald, Embassador

Hi Amir,

please take a look at the second quote in my post above, which contains the statement: "the egress port for unknown unicast packets [...] based on Layer 3 source and destination IP address."

Because you are using the custom algorithm for a LAG on the switch, unknown unicast packets are always load shared in a LAG based on both source and destination IP address instead of what you configured for the LAG. Thus some packets, i.e. the unknown unicast frames, can be sent on a different port than the known unicast frames. This is an explanation why it is possible for a layer 3 switch to break some packets out of a flow.
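The difference between the configured key and the unknown unicast key can be sketched in Python. Both functions below are an illustrative model only: CRC-32 over the packed addresses, modulo the link count, with hypothetical names `configured_member` and `unknown_unicast_member`. The real hardware hash is not described in this thread.

```python
import ipaddress
import zlib


def _crc_member(data: bytes, num_links: int) -> int:
    return zlib.crc32(data) % num_links


def configured_member(src: str, dst: str, num_links: int = 2) -> int:
    """Known unicast on SW1 in this model: hash the source address only."""
    return _crc_member(ipaddress.IPv4Address(src).packed, num_links)


def unknown_unicast_member(src: str, dst: str, num_links: int = 2) -> int:
    """Unknown unicast in this model: hash source and destination together."""
    key = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return _crc_member(key, num_links)


src = "10.0.0.1"
base = int(ipaddress.IPv4Address("10.0.1.1"))
# Destinations that differ from 10.0.1.1 in exactly one bit, plus 10.0.1.1 itself.
dsts = [str(ipaddress.IPv4Address(base ^ (1 << k))) for k in range(32)] + ["10.0.1.1"]

print({configured_member(src, d) for d in dsts})       # one member for every destination
print({unknown_unicast_member(src, d) for d in dsts})  # member varies with the destination
```

With the source-only key, one host always maps to one port. Once the destination enters the key, the same host is spread over several ports, so the port no longer follows from the source-only configuration.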

Anyway, a firewall between the two switches should be configured to understand that both physical links are one logical link. Thus any flow that is seen on one of the physical links must be treated as using the logical link. Otherwise your setup might work for a specific hardware/software combination of switches, but that may change with a software upgrade, or when replacing the hardware with a newer (or just different) model, or when changing vendors. If you were to change to an MLAG setup (replacing each switch by an MLAG pair), the packet would always egress the local port. If you were to use a stack or virtual switch bond (or virtual chassis or virtual switching system, as other vendors call it), you may or may not be able to configure whether a local member port is used, so you may not be able to create the same behavior as with the MLAG.

Long story short, placing some device that does not understand LAG into the middle of a LAG (actually, breaking up the LAG into two parts, both comprised of one LAG aware device and a LAG unaware device) is a bad idea.

Thanks,
Erik