MLAG two-tier design, LACP config for Windows Server 2012 R2 servers


We are now testing the following two-tier MLAG design:

Two BD8810s connected to each other with a 2x10 Gbit LACP LAG.

Two X670s connected to the BDs with a 2x10 Gb LACP LAG; each X670 has one link to BD1 and the other to BD2.

BD1 ---ISC--- BD2 --> VRRP interface for the servers; the VLAN is configured on the ISL.

X670-1 ---ISC--- X670-2 --> MLAG for the LACP server connections

The MLAG ports for the servers are configured on the X670s.

We configured an X440 as a dummy server to check the MLAG functionality. This works fine, with no interruptions.

Now we are testing the same constellation with MS Windows Server 2012 R2 servers, and we get a lot of ICMP loss. The connection to the server is interrupted when, for example, we open an mstsc session or just move the mouse.

All LACP LAGs are configured as L3_L4. The aggregation to the server comes up and the MLAG status is up, but as soon as traffic to the server appears, we lose the connection to the servers.

There are no events in the logs; MLAG and LACP always look fine.

Any ideas?

18 replies

Userlevel 2
What kind of LAG have you configured on the server? LACP?
Yes, LACP was configured on the server.
Userlevel 2
Have you enabled LACP on the server-facing port on the X670?

Example:

enable sharing 4 grouping 4 algorithm address-based L3 lacp

For MS servers, try only L3 instead of L3_L4; that seems to work well.
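To see why changing the sharing algorithm from L3_L4 to L3 can change which member link a flow uses, here is a toy sketch. It is not the X670's actual hash: the XOR-and-modulo functions, the IP addresses, and the TCP ports below are assumptions chosen only for illustration.

```python
import ipaddress


def pick_link_l3(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Toy L3 hash: XOR of the two IPv4 addresses, modulo the link count."""
    key = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return key % n_links


def pick_link_l3_l4(src_ip: str, dst_ip: str,
                    src_port: int, dst_port: int, n_links: int) -> int:
    """Toy L3_L4 hash: additionally XORs in the layer 4 ports."""
    key = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    key ^= src_port ^ dst_port
    return key % n_links


# Two hypothetical RDP sessions between the same client and server.
flows = [
    ("10.0.0.10", "10.0.0.20", 50000, 3389),
    ("10.0.0.10", "10.0.0.20", 50001, 3389),
]

for src, dst, sport, dport in flows:
    print(
        f"{src}:{sport} -> {dst}:{dport}",
        "L3 link:", pick_link_l3(src, dst, 2),
        "L3_L4 link:", pick_link_l3_l4(src, dst, sport, dport, 2),
    )
```

In this toy model, every packet between the same pair of IP addresses stays on one member link with L3, while L3_L4 can place two sessions between the same hosts on different links.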
Yes, on the two X670 switches the ports forming the MLAG are configured with LACP L3_L4.

I will try LACP L3; I hope it gets better.
Userlevel 6
Try looking at the output of # sh lacp counters
It works a lot better with the port sharing config L3 LACP than with L3_L4 LACP.
To be continued....

We still have the packet loss problem in this configuration.

The Windows Server 2012 R2 servers have LACP activated, with the load balancing mode set to Dynamic.

I can use any LACP config (L2, L3, port-based, or address-based); the ICMP loss does not disappear.

Does anyone have a similar configuration with a two-tier MLAG and Windows Hyper-V servers, and can you give me advice?
Userlevel 4
Is the Windows server using Microsoft NLB to create the LAG on the server end? If so, is it set up to use multicast?

-Andrew
Userlevel 2
Please share your config on both x670 and MS windows.
On the Windows server:

Two 10 Gbit NICs

LACP, Dynamic load balancing

The X670s:

* TEST_X670-2.3 # sh conf vsm
#
# Module vsm configuration.
#
configure mlag ports convergence-control fast
create mlag peer "switch1"
configure mlag peer "switch1" ipaddress 1.1.1.1 vr VR-Default
configure mlag peer "switch1" lacp-mac aa:bb:cc:dd:ee:ee
enable mlag port 1 peer "switch1" id 1
enable mlag port 2 peer "switch1" id 2
enable mlag port 3 peer "switch1" id 3
enable mlag port 4 peer "switch1" id 4
enable mlag port 5 peer "switch1" id 5
enable mlag port 6 peer "switch1" id 6
enable mlag port 7 peer "switch1" id 7
enable mlag port 8 peer "switch1" id 8
enable mlag port 9 peer "switch1" id 9
enable mlag port 10 peer "switch1" id 10
enable mlag port 11 peer "switch1" id 11
enable mlag port 12 peer "switch1" id 12
enable mlag port 17 peer "switch1" id 17
enable mlag port 18 peer "switch1" id 18
enable mlag port 19 peer "switch1" id 19
enable mlag port 20 peer "switch1" id 20
enable mlag port 21 peer "switch1" id 21
enable mlag port 22 peer "switch1" id 22
enable mlag port 23 peer "switch1" id 23
enable mlag port 24 peer "switch1" id 24
enable mlag port 25 peer "switch1" id 25
enable mlag port 26 peer "switch1" id 26
enable mlag port 27 peer "switch1" id 27
enable mlag port 28 peer "switch1" id 28
enable mlag port 40 peer "switch1" id 40

* TEST_X670-2.4 # sh ports sharing
Load Sharing Monitor
Config  Current  Agg      Ld Share   Ld Share  Agg  Link   Link Up
Master  Master   Control  Algorithm  Group     Mbr  State  Transitions
==============================================================================
1       1        LACP     L3         1         Y    A      1
2                LACP     L3         2         -    A      1
3                LACP     L3         3         -    A      8
4                LACP     L3         4         -    A      3
5       5        LACP     L3         5         Y    A      5
6       6        LACP     L3         6         Y    A      6
7                LACP     L3         7         -    A      1
8                LACP     L3         8         -    A      7
9                LACP     L3         9         -    A      1
10               LACP     L3         10        -    A      1
11               LACP     L3         11        -    A      4
12               LACP     L3         12        -    A      1
17               LACP     L3         17        -    R      0
18               LACP     L3         18        -    R      0
19               LACP     L3         19        -    R      0
20               LACP     L3         20        -    R      0
21               LACP     L3         21        -    R      0
22               LACP     L3         22        -    R      0
23               LACP     L3         23        -    R      0
24               LACP     L3         24        -    R      0
25               LACP     L3         25        -    R      0
26               LACP     L3         26        -    R      0
27               LACP     L3         27        -    R      0
28               LACP     L3         28        -    R      0
40      40       LACP     L3_L4      40        Y    A      1
41      41       LACP     L3_L4      41        Y    A      1
                          L3_L4      42        Y    A      1
                          L3_L4      43        Y    A      1
                          L3_L4      44        Y    A      1
47      47       LACP     L3_L4      47        Y    A      4
                          L3_L4      48        Y    A      4
==============================================================================
Link State: A-Active, D-Disabled, R-Ready, NP-Port not present, L-Loopback
Load Sharing Algorithm: (L2) Layer 2 address based, (L3) Layer 3 address based
                        (L3_L4) Layer 3 address and Layer 4 port based
                        (custom) User-selected address-based configuration
Custom Algorithm Configuration: ipv4 L3-and-L4, xor
Number of load sharing trunks: 27

* TEST_X670-2.5 # sh edp ports all

Port  Neighbor          Neighbor-ID              Remote  Age  Num
                                                 Port         Vlans
=============================================================================
40    TEST-ServerDummy  00:00:00:04:96:7d:f3:8d  1:2     54   1
41    TEST_BD8810-1     00:00:00:04:96:1d:69:10  1:2     34   6
42    TEST_BD8810-1     00:00:00:04:96:1d:69:10  2:2     17   6
43    TEST_BD8810-2     00:00:00:04:96:1d:db:e0  1:2     33   6
44    TEST_BD8810-2     00:00:00:04:96:1d:db:e0  2:2     57   6
47    TEST_X670-1       00:00:00:04:96:98:fd:06  1:47    5    1
48    TEST_X670-1       00:00:00:04:96:98:fd:06  1:48    28   1
=============================================================================
* TEST_X670-2.6 #

##########################################################################

* TEST_X670-1.1 # sh conf "vsm"
#
# Module vsm configuration.
#
configure mlag ports convergence-control fast
create mlag peer "switch2"
configure mlag peer "switch2" ipaddress 1.1.1.2 vr VR-Default
configure mlag peer "switch2" lacp-mac aa:bb:cc:dd:ee:ee
enable mlag port 1 peer "switch2" id 1
enable mlag port 2 peer "switch2" id 2
enable mlag port 3 peer "switch2" id 3
enable mlag port 4 peer "switch2" id 4
enable mlag port 5 peer "switch2" id 5
enable mlag port 6 peer "switch2" id 6
enable mlag port 7 peer "switch2" id 7
enable mlag port 8 peer "switch2" id 8
enable mlag port 9 peer "switch2" id 9
enable mlag port 10 peer "switch2" id 10
enable mlag port 11 peer "switch2" id 11
enable mlag port 12 peer "switch2" id 12
enable mlag port 17 peer "switch2" id 17
enable mlag port 18 peer "switch2" id 18
enable mlag port 19 peer "switch2" id 19
enable mlag port 20 peer "switch2" id 20
enable mlag port 21 peer "switch2" id 21
enable mlag port 22 peer "switch2" id 22
enable mlag port 23 peer "switch2" id 23
enable mlag port 24 peer "switch2" id 24
enable mlag port 25 peer "switch2" id 25
enable mlag port 26 peer "switch2" id 26
enable mlag port 27 peer "switch2" id 27
enable mlag port 28 peer "switch2" id 28
enable mlag port 40 peer "switch2" id 40
* TEST_X670-1.2 # sh ports sharing
Load Sharing Monitor
Config  Current  Agg      Ld Share   Ld Share  Agg  Link   Link Up
Master  Master   Control  Algorithm  Group     Mbr  State  Transitions
==============================================================================
1       1        LACP     L3         1         Y    A      1
2                LACP     L3         2         -    A      1
3                LACP     L3         3         -    A      2
4                LACP     L3         4         -    A      7
5       5        LACP     L3         5         Y    A      3
6       6        LACP     L3         6         Y    A      3
7                LACP     L3         7         -    A      1
8                LACP     L3         8         -    A      7
9                LACP     L3         9         -    A      1
10               LACP     L3         10        -    A      1
11               LACP     L3         11        -    A      1
12               LACP     L3         12        -    A      5
17               LACP     L3         17        -    R      0
18               LACP     L3         18        -    R      0
19               LACP     L3         19        -    R      0
20               LACP     L3         20        -    R      0
21               LACP     L3         21        -    R      0
22               LACP     L3         22        -    R      0
23               LACP     L3         23        -    R      0
24               LACP     L3         24        -    R      0
25               LACP     L3         25        -    R      0
26               LACP     L3         26        -    R      0
27               LACP     L3         27        -    R      0
28               LACP     L3         28        -    R      0
40      40       LACP     L3_L4      40        Y    A      1
41      41       LACP     L3_L4      41        Y    A      1
                          L3_L4      42        Y    A      1
                          L3_L4      43        Y    A      1
                          L3_L4      44        Y    A      1
47      47       LACP     L3_L4      47        Y    A      2
                          L3_L4      48        Y    A      2
==============================================================================
Link State: A-Active, D-Disabled, R-Ready, NP-Port not present, L-Loopback
Load Sharing Algorithm: (L2) Layer 2 address based, (L3) Layer 3 address based
                        (L3_L4) Layer 3 address and Layer 4 port based
                        (custom) User-selected address-based configuration
Custom Algorithm Configuration: ipv4 L3-and-L4, xor
Number of load sharing trunks: 27
* TEST_X670-1.3 # sh edp ports all

Port  Neighbor          Neighbor-ID              Remote  Age  Num
                                                 Port         Vlans
=============================================================================
40    TEST-ServerDummy  00:00:00:04:96:7d:f3:8d  1:1     1    1
41    TEST_BD8810-1     00:00:00:04:96:1d:69:10  1:1     17   6
42    TEST_BD8810-1     00:00:00:04:96:1d:69:10  2:1     33   6
43    TEST_BD8810-2     00:00:00:04:96:1d:db:e0  1:1     17   6
44    TEST_BD8810-2     00:00:00:04:96:1d:db:e0  2:1     55   6
47    TEST_X670-2       00:00:00:04:96:98:fc:9c  1:47    15   1
48    TEST_X670-2       00:00:00:04:96:98:fc:9c  1:48    54   1
=============================================================================
* TEST_X670-1.4 #
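
The pasted `sh ports sharing` output is long, so a small script can help pick out rows that may deserve a second look. The following is a minimal sketch, assuming that the last four fields of every data row are the Ld Share Group, Agg Mbr, Link State, and Link Up transitions columns, as in the output above. The function names and the threshold of 3 transitions are illustrative choices, not Extreme guidance.

```python
from typing import List, NamedTuple


class SharingRow(NamedTuple):
    member: int        # value from the "Ld Share Group" column
    aggregated: bool   # True when the "Agg Mbr" column shows Y
    link_state: str    # A, D, R, NP or L, as in the legend
    transitions: int   # "Link Up transitions" column


def parse_sharing_rows(text: str) -> List[SharingRow]:
    """Extract data rows by reading each line from the right-hand side."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # separators, blank lines, short headers
        member, mbr, state, transitions = fields[-4:]
        if member.isdigit() and mbr in ("Y", "-") and transitions.isdigit():
            rows.append(SharingRow(int(member), mbr == "Y", state, int(transitions)))
    return rows


def active_but_not_aggregated(rows: List[SharingRow]) -> List[int]:
    """Ports whose link is Active but which are not shown as aggregator members."""
    return [r.member for r in rows if r.link_state == "A" and not r.aggregated]


def many_transitions(rows: List[SharingRow], threshold: int = 3) -> List[int]:
    """Ports with more link-up transitions than the (arbitrary) threshold."""
    return [r.member for r in rows if r.transitions > threshold]


# A few rows copied from the TEST_X670-2 output above.
SAMPLE = """\
1 1 LACP L3 1 Y A 1
2 LACP L3 2 - A 1
3 LACP L3 3 - A 8
5 5 LACP L3 5 Y A 5
17 LACP L3 17 - R 0
41 41 LACP L3_L4 41 Y A 1
L3_L4 42 Y A 1
"""

rows = parse_sharing_rows(SAMPLE)
print("active, not aggregated:", active_but_not_aggregated(rows))
print("more than 3 transitions:", many_transitions(rows))
```

Whether an active port that is not an aggregator member is a problem depends on what is attached to it; the sketch only lists such ports so they can be compared with the server-side team status.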
Hi, try stacking the two X670s over 10G and then connecting this stack to the BDs with MLAG, so that there is no two-tier MLAG config.
Userlevel 5
This looks like a configuration issue on the Windows servers. We need to know which adapters you have and how the LAG is configured. Make sure that you update the drivers to the latest available. Your configuration on the switches looks fine. What version of XOS are you using? Make sure that you are trying 15.5 or later; I would suggest 16.1.1.4. Overall, I have seen this issue with outdated Windows drivers. Let me know and we can figure it out.

Bill
Hi Bill

The switches are now running the recommended FW 15.6.3.1; I have also tested with 15.7.1.4 patch 1-1.

I wasn't satisfied with 16.1.1 because we couldn't authenticate with our RADIUS login, but that is a separate problem.

Switch primary Fri Jul 3 12:54:38 Unknown 2015 15.6.3.1 summitX-15.6.3.1.xos v1563b1
Switch primary Fri Jul 3 12:55:54 Unknown 2015 15.6.3.1 summitX-15.6.3.1-ssh.xmod v1563b1
Switch secondary Mon May 4 10:08:40 UTC 2015 15.7.1.4 summitX-15.7.1.4-patch1-1.xos v1571b4-patch1-1
Switch secondary Mon May 4 10:09:17 UTC 2015 15.7.1.4 summitX-15.7.1.4-ssh.xmod v1571b4

The server team told me that they are already using the latest drivers for their Broadcom NICs.
Userlevel 6
Hi Harry,

As part of the troubleshooting, please try disabling one of the NICs on the server and then check whether the issue persists. If the issue is gone, re-enable the NIC and configure MAC tracking on the Extreme switches to observe any MAC movement.

Please refer to the article below for the steps to configure MAC tracking.

https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-configure-MAC-tracking-in-EXOS/?q=mac-tracking&l=en_US&fs=Search&pn=1

After configuring the mac-tracking, check the logs and look for any mac-movement.
As Bill pointed out, the switch configuration looks fine and if you use an X440, the traffic flow is good.
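
To make "MAC movement" concrete, here is a minimal sketch that counts how often a MAC address changes port in a series of observations. The input format is hypothetical: it is not the EXOS MAC-tracking log format, so the (timestamp, MAC, port) records would first have to be extracted from the real log messages. The port numbers and the second MAC address below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Observation = Tuple[float, str, str]   # (timestamp in seconds, MAC, port)
Move = Tuple[float, str, str]          # (timestamp, old port, new port)


def find_mac_moves(observations: Iterable[Observation]) -> Dict[str, List[Move]]:
    """Return, per MAC, every change of port seen in time-ordered observations."""
    last_port: Dict[str, str] = {}
    moves: Dict[str, List[Move]] = defaultdict(list)
    for timestamp, mac, port in observations:
        mac = mac.lower().replace("-", ":")   # normalise Windows-style notation
        previous = last_port.get(mac)
        if previous is not None and previous != port:
            moves[mac].append((timestamp, previous, port))
        last_port[mac] = port
    return dict(moves)


# Hypothetical observations: the server NIC MAC flips between two ports,
# while a second (made-up) MAC stays where it is.
observations = [
    (0.0, "B0-83-FE-E5-D6-23", "5"),
    (1.2, "B0-83-FE-E5-D6-23", "47"),
    (1.9, "B0-83-FE-E5-D6-23", "5"),
    (3.0, "00:11:22:33:44:55", "6"),
    (4.0, "00:11:22:33:44:55", "6"),
]

for mac, mac_moves in find_mac_moves(observations).items():
    print(mac, "moved", len(mac_moves), "times:", mac_moves)
```

A MAC address that keeps alternating between ports while traffic is flowing would be consistent with the intermittent loss described earlier in the thread, but the sketch itself only counts moves.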
Hi Bill and Prashant!

Here are the driver details from the server NIC:



Name                       : NIC1
InterfaceDescription       : Broadcom BCM57800 NetXtreme II 10 GigE (NDIS VBD Client) #46
InterfaceIndex             : 13
MacAddress                 : B0-83-FE-E5-D6-23
MediaType                  : 802.3
PhysicalMediaType          : 802.3
InterfaceOperationalStatus : Up
AdminStatus                : Up
LinkSpeed(Gbps)            : 10
MediaConnectionState       : Connected
ConnectorPresent           : True
VlanID                     : 0
DriverInformation          : Driver Date 2013-06-11 Version 7.4.23.2 NDIS 6.30



Userlevel 5
That configuration looks correct according to the team, but I suspect there are newer Broadcom drivers for the NIC. I looked at the Broadcom site, and there are version 17 drivers (dated 1-26-15) out there. Can you please confirm? About 90% of the time, the drivers are the issue.
Hi Bill!

We stepped back to a server-independent config, and it works fine. We did this because we are in a bit of a hurry to bring this design into production.

I will try to build this up again later this year.

Thank you very much for your helpful support!
Userlevel 5
Sounds good. Please let us know when you try this again so we can share your experience with others. Let us know if we can help; that's what we are here for!

Bill
