11-08-2019 08:47 PM
Hi
Currently have a strange networking issue, see the diagram below:
This shows two stacked EXOS switches in each data centre which are joined via MLAG.
There is an active / passive firewall pair. There are 4 x /30 P2P links on the active firewall: 2 go to one stack and the other 2 go to the other stack. All works well, ECMP is configured and the routing table reflects the routes correctly. The passive firewall has all its interfaces shut down.
Problem happens when the firewalls are flipped. You see both P2P links go down on each core, and then the other links that go to the other firewall come up as it goes active.
You see all the new P2P links then form a full adjacency.
What is essentially happening is that the /30 subnet on each P2P link moves from one core to the other.
The problem is that all the new links form a full adjacency, but you cannot ping the other end of each of the point-to-point links and traffic stops passing through the firewalls.
In fact, it seems one random P2P link out of the 4 will work and the others will not, and sometimes none work at all. If you fail the firewalls back, sometimes all the links are restored, and sometimes the link that successfully moved will not fail back.
There is a workaround though. Whenever a link stops working (in whatever scenario), you can simply disable and then re-enable the ports on the switch and all starts working!?
All OSPF settings on both EXOS and the firewall are default and match, i.e. timers, P2P, etc.
Both firewalls and switches were upgraded, which made no difference.
Enabled graceful restart, still no difference.
I can’t make sense of the issue. What could be causing it?
Ideas:
Even if one of those ideas were true, I’m not sure what I could do about it, so I’m hoping the community can help.
EXOS Version: 22.7.1.2 patch1-11
Palo Alto Version: 8.1.11
11-14-2019 09:25 PM
This is now fixed.
It seems I had been looking at the issue in reverse.
The MAC address of each of the ports presented to the Palo Alto is based on the Extreme switch MAC address. They are as follows:
Col-CEF-Core1
02:04:96:9F:94:D8
Col-2A22-Core2
02:04:96:9F:A4:74
To view the ARP table for each of the P2P ports on the Palo Altos, you can use the following commands:
show arp ethernet1/5
show arp ethernet1/6
show arp ethernet1/7
show arp ethernet1/8
This is the state of the ARP table on the active firewall before failover:
Firewall A
interface     ip address      hw address          port          status  ttl
--------------------------------------------------------------------------------
ethernet1/5   172.20.251.66   02:04:96:9f:94:d8   ethernet1/5   c       1535
ethernet1/6   172.20.251.70   02:04:96:9f:94:d8   ethernet1/6   c       1528
ethernet1/7   172.20.251.74   02:04:96:9f:a4:74   ethernet1/7   c       1525
ethernet1/8   172.20.251.78   02:04:96:9f:a4:74   ethernet1/8   c       1526
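As a quick sanity check on the addressing, each ARP entry above should fall into its own /30. The following is a minimal sketch using Python's standard `ipaddress` module. It assumes the /30 mask mentioned in the original post applies to all four links, and the helper name `p2p_peers` is purely illustrative.

```python
import ipaddress

# Switch-side addresses copied from the ARP table above.
arp_ips = ["172.20.251.66", "172.20.251.70", "172.20.251.74", "172.20.251.78"]

def p2p_peers(ip, prefix=30):
    """Return (network, usable host addresses) for the subnet containing ip.

    The /30 default is an assumption taken from the original post.
    """
    net = ipaddress.ip_interface(f"{ip}/{prefix}").network
    return str(net), [str(h) for h in net.hosts()]

for ip in arp_ips:
    net, hosts = p2p_peers(ip)
    print(f"{ip} is in {net}, usable hosts: {hosts}")
# First line printed:
# 172.20.251.66 is in 172.20.251.64/30, usable hosts: ['172.20.251.65', '172.20.251.66']
```

The other usable address in each /30 is presumably the firewall's own interface address, although the thread does not list it.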
This is the state of the ARP table after failover:
Firewall B
interface     ip address      hw address          port          status  ttl
--------------------------------------------------------------------------------
ethernet1/5   172.20.251.66   02:04:96:9f:94:d8   ethernet1/5   c       1245
ethernet1/6   172.20.251.70   02:04:96:9f:94:d8   ethernet1/6   c       1245
ethernet1/7   172.20.251.74   02:04:96:9f:a4:74   ethernet1/7   c       1245
ethernet1/8   172.20.251.78   02:04:96:9f:a4:74   ethernet1/8   c       1250
The issue here is that the P2P ports move to the other switch when the passive firewall becomes active. That means the MAC addresses should have swapped around, but they have not!
When you issue the ‘clear arp all’ command on the Palo Alto, this then refreshes the ARP entries to the now correct order and all works, see below:
Clear ARP All on active firewall:
ethernet1/5   172.20.251.66   02:04:96:9f:a4:74   ethernet1/5   c       1782
ethernet1/6   172.20.251.70   02:04:96:9f:a4:74   ethernet1/6   c       1782
ethernet1/7   172.20.251.74   02:04:96:9f:94:d8   ethernet1/7   c       1777
ethernet1/8   172.20.251.78   02:04:96:9f:94:d8   ethernet1/8   c       1735
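The difference between the table after failover and the table after clearing ARP can also be checked mechanically. This is a rough sketch in Python that parses rows shaped like the output above and reports every IP whose MAC changed. The function names `parse_arp` and `changed_entries` are illustrative, and the parser assumes the six-column row layout shown in this post.

```python
def parse_arp(text):
    """Map ip -> mac from rows laid out as: interface ip mac port status ttl."""
    table = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip headers and separators: keep only six-field rows with a MAC in column 3.
        if len(fields) == 6 and fields[2].count(":") == 5:
            table[fields[1]] = fields[2].lower()
    return table

def changed_entries(before, after):
    """Return {ip: (old_mac, new_mac)} for each IP whose MAC differs."""
    return {ip: (before[ip], after[ip])
            for ip in before if ip in after and before[ip] != after[ip]}

after_failover = """
ethernet1/5 172.20.251.66 02:04:96:9f:94:d8 ethernet1/5 c 1245
ethernet1/6 172.20.251.70 02:04:96:9f:94:d8 ethernet1/6 c 1245
ethernet1/7 172.20.251.74 02:04:96:9f:a4:74 ethernet1/7 c 1245
ethernet1/8 172.20.251.78 02:04:96:9f:a4:74 ethernet1/8 c 1250
"""
after_clear = """
ethernet1/5 172.20.251.66 02:04:96:9f:a4:74 ethernet1/5 c 1782
ethernet1/6 172.20.251.70 02:04:96:9f:a4:74 ethernet1/6 c 1782
ethernet1/7 172.20.251.74 02:04:96:9f:94:d8 ethernet1/7 c 1777
ethernet1/8 172.20.251.78 02:04:96:9f:94:d8 ethernet1/8 c 1735
"""

stale = changed_entries(parse_arp(after_failover), parse_arp(after_clear))
for ip, (old, new) in sorted(stale.items()):
    print(f"{ip}: cached {old}, should have been {new}")
```

With the tables from this post, all four entries are reported, which matches the observation that every cached MAC pointed at the wrong switch until the ARP table was cleared.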
The answer was to move the cables around on the passive firewall so that the same P2P subnet becomes active on the same switch, which means the MAC presented to the Palo Alto stays the same!
This may well be what the proper method is, or some alternative configuration may have helped but this sorted the issue for me.
Below is a diagram showing the previous connections in the top row, and below that, in red, what I moved them to:
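The rule behind the recabling can be expressed as a simple check: a link is at risk if it lands on a different core depending on which firewall is active. The sketch below is only a model in Python. The `original` layout follows the `show fdb` output posted in this thread, while the `recabled` layout is a hypothetical arrangement that satisfies the rule, not a record of the actual port moves.

```python
# Which core each P2P link lands on, keyed by the active firewall.
# "original" follows the 'show fdb' output in this thread (VLANs 3501-3504).
original = {
    "A": {"Link1": "Core1", "Link2": "Core1", "Link3": "Core2", "Link4": "Core2"},
    "B": {"Link1": "Core2", "Link2": "Core2", "Link3": "Core1", "Link4": "Core1"},
}

# Hypothetical layout after recabling: firewall B mirrors firewall A,
# so each subnet stays on the same switch whichever firewall is active.
recabled = {
    "A": dict(original["A"]),
    "B": dict(original["A"]),
}

def links_that_move(cabling):
    """List the links whose switch differs between firewall A and B."""
    a, b = cabling["A"], cabling["B"]
    return sorted(link for link in a if a[link] != b[link])

print(links_that_move(original))  # ['Link1', 'Link2', 'Link3', 'Link4']
print(links_that_move(recabled))  # []
```

An empty list means the switch MAC behind each subnet does not change on failover, so an ARP entry that survives the failover still points at the right switch.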
11-08-2019 09:34 PM
Hi David,
Thanks for posting.
Just checked and the MAC address seems to be moving OK:
Firewall A Active
Slot-1 Col-CEF-Core1.1 # show fdb ports 1:42-43,2:42-43
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
------------------------------------------------------------------------------------------------------
b4:0c:25:e2:c0:44 FW1-Link1-Core1(3501) 0000 d mi 1:42
b4:0c:25:e2:c0:45 FW1-Link2-Core1(3502) 0000 d mi 2:42
* Slot-1 Col-2A22-Core2.1 # show fdb ports 1:42-43,2:42-43
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
------------------------------------------------------------------------------------------------------
b4:0c:25:e2:c0:46 FW1-Link3-Core2(3503) 0000 d mi 1:43
b4:0c:25:e2:c0:47 FW1-Link4-Core2(3504) 0000 d mi 2:43
Firewall B Active
Slot-1 Col-CEF-Core1.2 # show fdb ports 1:42-43,2:42-43
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
------------------------------------------------------------------------------------------------------
b4:0c:25:e2:c0:46 FW2-Link3-Core1(3503) 0000 d mi 1:43
b4:0c:25:e2:c0:47 FW2-Link4-Core1(3504) 0000 d mi 2:43
* Slot-1 Col-2A22-Core2.2 # show fdb ports 1:42-43,2:42-43
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
------------------------------------------------------------------------------------------------------
b4:0c:25:e2:c0:44 FW2-Link1-Core2(3501) 0000 d mi 1:42
b4:0c:25:e2:c0:45 FW2-Link2-Core2(3502) 0000 d mi 2:42
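To make the move easier to see, the output above can be reduced to "which switch learned which firewall MAC" for each state. This is a minimal sketch in Python. The regular expressions assume the prompt and row layout shown in this post, and `mac_locations` is an illustrative name rather than any EXOS tool.

```python
import re

# Prompt lines look like: "Slot-1 Col-CEF-Core1.1 # show fdb ports ..."
PROMPT = re.compile(r"Slot-\d+\s+(\S+)\.\d+\s+#")
# Data rows look like: "b4:0c:25:e2:c0:44 FW1-Link1-Core1(3501) 0000 d mi 1:42"
ROW = re.compile(r"^([0-9a-f]{2}(?::[0-9a-f]{2}){5})\s+\S+\((\d+)\)", re.I)

def mac_locations(output):
    """Map MAC -> (switch name, VLAN tag) from pasted 'show fdb' output."""
    switch, seen = None, {}
    for line in output.splitlines():
        line = line.strip()
        prompt = PROMPT.search(line)
        if prompt:
            switch = prompt.group(1)  # remember which switch the rows belong to
            continue
        row = ROW.match(line)
        if row and switch:
            seen[row.group(1).lower()] = (switch, int(row.group(2)))
    return seen

fw_a_active = """
Slot-1 Col-CEF-Core1.1 # show fdb ports 1:42-43,2:42-43
b4:0c:25:e2:c0:44 FW1-Link1-Core1(3501) 0000 d mi 1:42
b4:0c:25:e2:c0:45 FW1-Link2-Core1(3502) 0000 d mi 2:42
* Slot-1 Col-2A22-Core2.1 # show fdb ports 1:42-43,2:42-43
b4:0c:25:e2:c0:46 FW1-Link3-Core2(3503) 0000 d mi 1:43
b4:0c:25:e2:c0:47 FW1-Link4-Core2(3504) 0000 d mi 2:43
"""
fw_b_active = """
Slot-1 Col-CEF-Core1.2 # show fdb ports 1:42-43,2:42-43
b4:0c:25:e2:c0:46 FW2-Link3-Core1(3503) 0000 d mi 1:43
b4:0c:25:e2:c0:47 FW2-Link4-Core1(3504) 0000 d mi 2:43
* Slot-1 Col-2A22-Core2.2 # show fdb ports 1:42-43,2:42-43
b4:0c:25:e2:c0:44 FW2-Link1-Core2(3501) 0000 d mi 1:42
b4:0c:25:e2:c0:45 FW2-Link2-Core2(3502) 0000 d mi 2:42
"""

a, b = mac_locations(fw_a_active), mac_locations(fw_b_active)
for mac in sorted(a):
    print(f"{mac} VLAN {a[mac][1]}: {a[mac][0]} -> {b[mac][0]}")
```

Each firewall MAC keeps its VLAN tag but is learned on the opposite core after the flip, so from the switch side the move looks clean.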
I also checked on core 1 that the previous MAC addresses did not get stuck, and they are no longer showing:
Slot-1 Col-CEF-Core1.3 # show fdb b4:0c:25:e2:c0:44
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
------------------------------------------------------------------------------------------------------
Slot-1 Col-CEF-Core1.3 # show fdb b4:0c:25:e2:c0:45
MAC VLAN Name( Tag) Age Flags Port / Virtual Port List
Thanks
Martin
11-08-2019 09:19 PM
Is the ARP entry getting stuck with the old, wrong MAC?