06-08-2020 10:45 AM
We are using an SPBM cloud of four VSP 8600s as our backbone. Two of them are connected to a layer 2 transport network in which the firewall is the default gateway. Last week we had a power outage, and a few weeks ago I rebooted the firewall at night. Both times the VSPs stopped using the firewall as gateway. Clients that tried to ping something behind the firewall got a "time to live exceeded" error. The VSPs themselves were still able to ping devices behind the firewall.
We create different security domains by using different VRFs on the VSPs. None of the other VRFs suffered from this problem, although they are routed by the same firewall, albeit via a different IP address.
The solution to the problem was deleting the route and recreating it.
This is the route we are using:
ip route 0.0.0.0 0.0.0.0 172.28.2.1 weight 1 preference 5
show ip route
************************************************************************************
Command Execution Time: Mon Jun 08 12:42:44 2020 CEST
************************************************************************************
=====================================================================================================
IP Route - GlobalRouter
=====================================================================================================
                                                NH                   INTER
DST             MASK            NEXT            VRF/ISID        COST FACE  PROT AGE TYPE PRF
-----------------------------------------------------------------------------------------------------
0.0.0.0         0.0.0.0         172.28.2.1      GlobalRouter    1    135   STAT 0   IB   5
This is the routing table on one of the VSPs that is not directly connected to the firewall:
show ip route
************************************************************************************
Command Execution Time: Mon Jun 08 12:40:27 2020 CEST
************************************************************************************
=====================================================================================================
IP Route - GlobalRouter
=====================================================================================================
                                                NH                   INTER
DST             MASK            NEXT            VRF/ISID        COST FACE  PROT AGE TYPE PRF
-----------------------------------------------------------------------------------------------------
0.0.0.0         0.0.0.0         pik             GlobalRouter    10   4051  ISIS 0   IBSE 7
0.0.0.0         0.0.0.0         kreuz           GlobalRouter    10   4051  ISIS 0   IBSE 7
0.0.0.0         0.0.0.0         pik             GlobalRouter    10   4052  ISIS 0   IBSE 7
0.0.0.0         0.0.0.0         kreuz           GlobalRouter    10   4052  ISIS 0   IBSE 7
What could possibly be the reason for this strange behavior?
11-10-2020 01:59 PM
IS-IS redistribution is enabled. Both connected VSPs have the firewall as default router and have RSMLT enabled. The other two VSPs learn the default route via route redistribution.
After the firewall uplink drops for two seconds, routing no longer works for connected clients, but the VSPs are still able to use the default gateway to reach systems behind the firewall. Our partner therefore guesses that this could be a bug in the hardware routing table.
That would fit: traffic from the CLI is routed in software and can use the gateway, while the hardware routing table used by connected clients cannot. Recreating the default route or resetting the GigabitEthernet interfaces fixes the problem, and both also cause an update of the hardware routing table.
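To make the partner's hypothesis concrete, here is a minimal Python sketch of two forwarding tables that disagree. Everything in it is an assumption for illustration: the node names `vsp-a` and `vsp-b` are hypothetical, and the thread does not say what the hardware table actually contained. The stale table is modelled with the two firewall-attached switches pointing at each other, chosen only because a loop of that kind would produce a "time to live exceeded" error.

```python
def trace(table, start, ttl=8):
    """Follow default-route next hops from `start` until delivery or TTL expiry.

    `table` maps a node name to its next hop; None means the node delivers
    the packet itself. Returns (status, path).
    """
    node, path = start, [start]
    while True:
        next_hop = table[node]
        if next_hop is None:
            return "delivered", path
        if ttl == 0:
            return "ttl exceeded", path
        ttl -= 1  # each forwarding hop decrements the TTL
        node = next_hop
        path.append(node)


# Software table: what the CLI ping uses (assumed correct after the outage).
software_table = {"vsp-a": "firewall", "vsp-b": "firewall", "firewall": None}

# Hardware table: HYPOTHETICAL stale state in which the default route on
# each switch points at its peer instead of the firewall.
stale_hardware_table = {"vsp-a": "vsp-b", "vsp-b": "vsp-a", "firewall": None}

cli_status, cli_path = trace(software_table, "vsp-a")
client_status, client_path = trace(stale_hardware_table, "vsp-a")
print(cli_status, cli_path)
print(client_status, client_path)
```

In this toy model the CLI path reaches the firewall while client traffic bounces between the two switches until the TTL runs out, which matches the symptoms described but does not prove the cause.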
As far as I remember, the error first appeared after we updated to version 6.3.4. At the moment we are using 6.3.5 with the same problems.
EDIT: By the way, this is my personal account; I'm the OP.
10-16-2020 04:49 PM
Is the firewall pointing to 172.28.2.2 as its next hop for the networks in question?
Shot in the dark, do the routes match on all four VSP8600s? Going to need to do ISIS redistribution of static and direct routes between the four of them.
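One way to check whether the default routes match is to paste the `show ip route` output from each switch into a small script and compare the entries. The following is a rough Python sketch, not an Extreme tool: it assumes the ten-column row layout shown in the outputs earlier in this thread, and the function names `parse_routes` and `default_route_summary` are made up for this example.

```python
from collections import namedtuple

Route = namedtuple(
    "Route", "dst mask next_hop vrf cost iface proto age rtype pref"
)


def parse_routes(text):
    """Parse the data rows of pasted `show ip route` output.

    Assumes every data row has exactly ten whitespace-separated fields and
    starts with a dotted-quad destination; banner and header lines are skipped.
    """
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 10 and fields[0].count(".") == 3:
            routes.append(Route(*fields))
    return routes


def default_route_summary(text):
    """Return the set of (protocol, next hop) pairs for 0.0.0.0/0."""
    return {
        (r.proto, r.next_hop)
        for r in parse_routes(text)
        if r.dst == "0.0.0.0" and r.mask == "0.0.0.0"
    }


# Rows copied from the two outputs posted in this thread.
fw_attached = "0.0.0.0 0.0.0.0 172.28.2.1 GlobalRouter 1 135 STAT 0 IB 5"
remote = """\
DST MASK NEXT VRF/ISID COST FACE PROT AGE TYPE PRF
0.0.0.0 0.0.0.0 pik GlobalRouter 10 4051 ISIS 0 IBSE 7
0.0.0.0 0.0.0.0 kreuz GlobalRouter 10 4051 ISIS 0 IBSE 7
0.0.0.0 0.0.0.0 pik GlobalRouter 10 4052 ISIS 0 IBSE 7
0.0.0.0 0.0.0.0 kreuz GlobalRouter 10 4052 ISIS 0 IBSE 7
"""

print(default_route_summary(fw_attached))
print(default_route_summary(remote))
```

Running the same summary on all four switches before and after a firewall uplink flap would show whether the software routing table changes at all, or whether only the forwarding behavior does.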
06-09-2020 02:15 PM
Thanks for the info, I'll keep you updated. By the way, the problem was the same with version 6.3.3.0. After the first occurrence I updated the systems in the hope that this would solve the problem, which it unfortunately didn't.
06-09-2020 12:18 AM
I don't have any experience with 6.3.4.0. I'm running 6.3.3.0 everywhere and haven't seen such behavior. The configuration looks fine from what you've shown.
I would recommend that you open a case. The issue looks pretty easy to reproduce, so it should not be a problem for Extreme to look into it. I am interested in further information about this issue, as we are using a similar setup. Would you please keep us updated?