Preparing for BGP neighbor outage

Frank
Contributor II
OK, so let's say I'm BGP multihomed with multiple providers, using two routers (480s), and that I have my own ASN (12345). My BGP is happily trucking away, and I'm advertising my networks to all my peers.

Now provider X tells me that they'll have to do maintenance on my circuit. If that BGP peer drops, I know that I'll still have Internet access, but I also know that I will have a 3-4 minute window where BGP re-shuffles routes, and everything that used to come in or go out through provider X drops connectivity - essentially a short service outage, and there are people out there that (a) notice, and (b) aren't too happy when that happens.

So how do I best prepare for that? In the past (Cisco gear), I've prepended my advertised AS path to neighbor X with "10 more of 12345". Essentially, that keeps existing connections alive (I hope), but within 10 minutes or so, nobody should be using that peer for incoming traffic anymore.

Is there a better way other than AS-prepend? (I don't think anyone has implemented RFC 6198 yet.)

I already have a policy in place for adverts out:
configure bgp neighbor 1.2.3.4 route-policy out AS-Localonly
to ensure that I only advertise locally originating paths. I'm thinking I could use that for prepends like this:

AS-Prepend.pol:
entry prepend-localonly {
    if {
        as-path "^$";
    } then {
        as-path "29765 29765 29765 29765 29765 29765 29765 29765 29765 29765";
        permit;
    }
}
entry DenyRest {
    if {
    } then {
        deny;
    }
}
Would this be a working policy? And I could just activate it with

configure bgp neighbor 1.2.3.4 route-policy out AS-Prepend
(possibly after an unconfigure bgp neighbor 1.2.3.4 route-policy out AS-Localonly, or whatever the proper syntax for that is)

Would that work? Is there a better way to do this? Again, the goal is to not have disruption due to BGP route convergence when one peer drops, because I'm shuffling traffic away before the drop.

Thanks!

P.S.: Bonus points - how do I script that (or whatever alternative), if I even can? If I know that the window is from 1am to 3am, I could automatically do the "config bgp neigh..." thing at 12:30am, and re-set it at 4:00am and never lose any sleep 🙂

Frank
Contributor II
Paul,

Could you elaborate on the "clean shutdown"? Would a "disable bgp neighbor X" do that without incurring the 3-minute problem? Or is there a better command (sequence) to do that?

How would I announce zero routes to a specific neighbor? "Deny everything" on an outbound bgp policy?

We're only announcing a handful of blocks. About 6-8 IPv4 blocks and one IPv6 block.
Yes, we're getting "full routes plus default" from our upstream providers, but we limit that to 500K routes so as not to overload the routing table. The 480s were advertised to us as "datacenter ingress/egress routers" to replace Cisco 7606s.

We have 3-4 providers between two routers in our datacenters, so fortunately when one does maintenance, we can still afford another one to fail without warning. "Backhoe mating season" might still catch us, though 😉

Thanks!

Frank

Paul_Thornton
New Contributor III
Bear in mind that whether you prepend once, or 20 times, another ISP can be using local-pref to select what you consider the 'bad' route. This is particularly the case for other customers of your direct transit provider - as if I am an ISP and you are my customer, I will always prefer the direct connection I'm selling you over a peer or other connection. Some ISPs allow you to use communities to perform traffic engineering to limit the way they propagate routes you're announcing to them, but that is getting a bit complex for your use case I think.

The way we tend to do this is either: (a) configure the peering session to announce zero routes, and accept no routes from the peer, around 30 minutes before any maintenance; or (b) simply shut down the session cleanly before any work starts.

Whatever you do, you are going to cause some recalculation. The important part is that when a BGP peering is closed gracefully (i.e., by one end shutting it down), there is no waiting for hold timers to expire, and the routers at both ends can immediately send withdrawals for any routes learned via the peer.
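For illustration, options (a) and (b) could be sketched on EXOS roughly as below. This is a minimal sketch, not a tested procedure: the policy name DenyAll is hypothetical, the neighbor address 1.2.3.4 and the AS-Localonly policy are taken from the original question, and the restore step assumes no inbound policy was configured before. Command syntax should be checked against the command reference for the EXOS version in use.

```
# DenyAll.pol (hypothetical name): the empty "if" matches every route,
# the same way the DenyRest entry in the question does
entry deny-all {
    if {
    } then {
        deny;
    }
}
```

```
# (a) about 30 minutes before the window: announce nothing, accept nothing
check policy DenyAll
configure bgp neighbor 1.2.3.4 route-policy out DenyAll
configure bgp neighbor 1.2.3.4 route-policy in DenyAll

# (b) just before the work starts: close the session cleanly
disable bgp neighbor 1.2.3.4

# after the maintenance: restore the normal policies and bring the session back
configure bgp neighbor 1.2.3.4 route-policy out AS-Localonly
configure bgp neighbor 1.2.3.4 route-policy in none
enable bgp neighbor 1.2.3.4
```

Depending on the software version, a policy change on an established session may only take effect after a soft reset of that neighbor, so it is worth confirming that the announcements were actually withdrawn before relying on this.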

How many routes are you getting from your providers (and how many do you announce)? Is it a lot (I hope not a full table on an X480!) - If so, the use of 'don't announce anything / don't accept anything' policy before maintenance may work well. If you only have a default route, then cleanly closing the session should work ok.

I can see that prepending may seem a good idea, but I'm not sure that you'll actually benefit much more doing that than simply ensuring that the BGP neighbour is cleanly disabled before the maintenance.

One thing to be aware of - if you stop announcing prefixes to one of your upstreams in preparation for maintenance, of course you now have a single point of failure with the other one!

Paul.

Stephane_Grosj1
Extreme Employee
Hi,

From my experience, the average AS-path length is 4-5, so prepending your AS three times should have a strong enough "impact".

For local-pref, there is a knob in the CLI to change it globally, and yes, that requires disabling BGP first. However, you can create policies that set a given local-pref per prefix.
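A per-policy local-pref could look something like the sketch below. The policy name LowPref and the value 50 are assumptions chosen for illustration (the BGP default local-pref is 100, so anything lower makes routes from this neighbor less attractive for outbound traffic); the neighbor address is the one from the question.

```
# LowPref.pol (hypothetical name): accept every route from the neighbor,
# but mark it as less preferred than the default local-pref of 100
entry lower-pref-all {
    if {
    } then {
        local-preference 50;
        permit;
    }
}
```

```
# applied inbound on the neighbor that is about to go into maintenance
check policy LowPref
configure bgp neighbor 1.2.3.4 route-policy in LowPref
```

Because the policy is attached to a single neighbor in the inbound direction, it behaves like a per-neighbor preference without touching the global setting. Adding match conditions to the "if" block would narrow it to specific prefixes.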

Frank
Contributor II
That sounds as if my policy should do what I want it to do - thanks!

I'm using 15.3 in one location, and 15.5 in another - and I may have been exaggerating the prepends. Past experience on my Cisco routers (which we'll eventually phase out) has shown that 3-4 path prepends are enough to kill inbound traffic from that provider and force the world through the others. Interestingly enough that has also throttled outbound traffic through that neighbor, but I'm not clear why.

Changing local-pref doesn't seem to be much of an option, as I'd have to disable bgp to change it. It also seems that I can't put that preference in on a per-neighbor basis. I think if I need to mess with outbound paths (to move things away from a neighbor), I could probably prepend their BGP path similarly with an "as-path '123 123 123' " where "123" is that neighbor's ASN. I'd just do it as a "route-policy in".

Thanks for the answer!

Stephane_Grosj1
Extreme Employee
Hi,

You're prepending your own AS a lot of times; hopefully not as many as the prepends that crashed parts of the internet in past years.

AS-Prepend will influence the traffic coming into your AS.
For the traffic going out of your AS, you want to modify local-pref.

Is that going to be applied to a full view (your main feed) or only to some routes? A 3-minute convergence is what you get for a full-view failover.

You can easily script your CLI commands and have them launched by UPM timer.
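As a rough sketch of what that could look like: the profile and timer names below (bgp-drain, drain-timer) are hypothetical, the date fields are left as placeholders, and the UPM syntax is written from memory, so it should be verified against the command reference for your EXOS version. The policy and neighbor are the ones from the question.

```
# profile holding the CLI commands to run; the body is entered
# interactively and ends with a line containing only "."
create upm profile bgp-drain
configure bgp neighbor 1.2.3.4 route-policy out AS-Prepend
.

# timer that fires the profile at 00:30:00 on the chosen date
create upm timer drain-timer
configure upm timer drain-timer profile bgp-drain
configure upm timer drain-timer at <month> <day> <year> 0 30 0
```

A second profile and timer pair, set for 04:00 and re-applying AS-Localonly, would restore the normal announcements after the window.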

What EXOS version do you use?