
different BGP session behavior 12.6->15.3?

Userlevel 4
Create Date: Feb 28 2013 8:38PM

Hi all,

I recently tried upgrading one of our core X480 routers from 12.6.3-patch1-8 to 15.3.1 as we're hoping to turn up IPv6 BGP soon. The switch is peered with another X480 inside our AS, running 12.6, and an ISP router, which is a Juniper. The IBGP session came up just fine, but the EBGP session with the Juniper came up, exchanged routes, and was shut down from the Juniper side, with an "optional attributes error." The session gets reset, and eventually the cycle repeats. Traffic flows, but the resets cause our routes to flap through that ISP, causing connectivity issues. The ISP has opened a ticket with Juniper to see what might be causing this, but has anyone seen behavior like this in moving from 12.6 to 15.3? Is there a different set of standard behaviors for BGP sessions in 15.3?

Anonymized config follows. Route policy "ISPOut" referenced below restricts our exported routes to our NLRI, matching exactly.

enable bgp address-family ipv4-unicast advertise-inactive-route
configure bgp AS-number 65000
configure bgp routerid
configure bgp maximum-paths 4
enable bgp community format AS-number:number
configure bgp restart aware-only
configure bgp add network
create bgp neighbor remote-AS-number 65001
configure bgp neighbor source-interface ipaddress
configure bgp neighbor password encrypted "blahblahblah"
configure bgp neighbor description "ISP BGP Peer"
create bgp neighbor remote-AS-number 65000
configure bgp neighbor source-interface ipaddress
configure bgp neighbor password encrypted "blahblahblah"
enable bgp neighbor
configure bgp neighbor next-hop-self
configure bgp neighbor route-policy out ISPOut
disable bgp neighbor capability ipv4-multicast
configure bgp neighbor send-community standard
disable bgp neighbor capability ipv4-multicast
(from Ansley_Barnes)

16 replies

Create Date: Mar 1 2013 10:14PM

Hello Ansley

I have not seen this behavior. I would recommend opening a case with Extreme TAC as well so we can try and reproduce.

P (from Paul_Russo)
Create Date: Mar 3 2013 3:45AM

Ansley, per the RFC, if you receive a malformed BGP update (in this case I believe it is an aggregator attribute with an AS value of 0), the session is supposed to reset. However, since such an update might be sent constantly by an ISP, a CLI command is being introduced in 15.3.2 (or 15.2.3) as a workaround. As a reference point, I recommend checking PD # PD4-3298430801 in the 15.2.3 release notes.

In other words, I recommend downloading the 15.2.3 image and running the command configure bgp invalid-message-action drop-attribute aggregator-as-number-0 to address this issue. (from ethernet)
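
The condition described above (an AGGREGATOR path attribute whose AS field is 0) can be sketched as a small Python check. This is a minimal illustration, not how EXOS or Junos implement it: it assumes you already have the raw path-attributes field of a single UPDATE (for example, extracted from a packet capture), and the function names are made up for this example.

```python
import struct

AGGREGATOR = 7               # path attribute type code for AGGREGATOR
FLAG_EXTENDED_LENGTH = 0x10  # attribute length is 2 bytes instead of 1


def iter_path_attributes(data: bytes):
    """Yield (flags, type_code, value) for each attribute in the field."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 3:
            raise ValueError("truncated attribute header")
        flags, type_code = data[offset], data[offset + 1]
        if flags & FLAG_EXTENDED_LENGTH:
            if len(data) - offset < 4:
                raise ValueError("truncated extended-length header")
            (length,) = struct.unpack_from("!H", data, offset + 2)
            offset += 4
        else:
            length = data[offset + 2]
            offset += 3
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated attribute value")
        offset += length
        yield flags, type_code, value


def aggregator_as(value: bytes) -> int:
    """Return the AS number from an AGGREGATOR value (2- or 4-byte AS)."""
    if len(value) == 6:   # 2-byte AS + 4-byte IPv4 address
        return struct.unpack_from("!H", value)[0]
    if len(value) == 8:   # 4-byte AS + 4-byte IPv4 address
        return struct.unpack_from("!I", value)[0]
    raise ValueError("unexpected AGGREGATOR length")


def has_aggregator_as0(path_attributes: bytes) -> bool:
    """True if any AGGREGATOR attribute in the field carries AS 0."""
    return any(
        type_code == AGGREGATOR and aggregator_as(value) == 0
        for _flags, type_code, value in iter_path_attributes(path_attributes)
    )
```

Running `has_aggregator_as0` over the attributes of each captured UPDATE would be one way to find the specific message that triggers the reset, assuming the capture can be decoded down to that field.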
Create Date: Mar 6 2013 4:09PM

Many thanks! I've opened a ticket with TAC to investigate this, and I'll look into ethernet's suggestion as well. We were hoping to keep the features in 15.3 - is there a timeframe for the release of 15.3.2, or a way to get this config into 15.3.1? Otherwise we can attempt a rollback to 15.2.3 until 15.3.x is patched. I appreciate the info. (from Ansley_Barnes)
Create Date: Mar 6 2013 4:10PM

It's been a difficult issue to troubleshoot because neither side of the session can see exactly what message(s) are causing the reset. (from Ansley_Barnes)
Create Date: Mar 6 2013 7:29PM

I'm still waiting on TAC (I don't have download access to images for 15.2.3 or 15.3.2) but I was wondering - is it possible to emulate this behavior with a policy applied to the routes received from the peer? something like if {as-number 0;} then {deny;}? Would this (assuming it's possible) even prevent the reset from occurring? I'd be more experimental in my testing of solutions, but every time this issue recurs it disrupts connectivity to our AS from the outside, and I'd naturally like to avoid that. (from Ansley_Barnes)
Create Date: Mar 7 2013 2:20PM

Those 2 software releases are not out yet. 15.2.3 is going to be out soon though from what I heard.

An ACL option to match the AS in the Aggregator attribute does not exist in EXOS, though it would address this issue for sure.

I think 15.1.4 is out and contains the fix, however. (from ethernet)
Create Date: Mar 7 2013 4:26PM

Good to know. I'm still waiting on TAC, oddly, but does have the same fix? It appears to in the release notes. I'm hoping TAC will have some insight too, I'm just trying to line up all my options here. I appreciate the info. (from Ansley_Barnes)
Create Date: Mar 9 2013 2:31AM

I applied, as I checked the release notes and it had the fix for # PD4-3298430801. I ran the command, and restarted the BGP session, but still am getting the Optional Attribute error. We've temporarily changed our NLRI filter to distribute no routes, so we have a bit more flexibility in testing solutions.

I'm really scratching my head here though - was there any other behavior changed for ipv4-unicast BGP sessions in 15.x? (from Ansley_Barnes)
Create Date: Mar 9 2013 9:31AM

I was told that was a typo in the release notes in patch 1-6. You have to wait for 15.2.3 to apply the configuration change effectively. (from ethernet)
Create Date: Mar 11 2013 12:06AM

Interesting - are you sure? It did accept the command to ignore the AS 0 with aggregate set properly, and it seems odd that the command would be added without the fix. I will try a further rollback to the confirmed fixed version tomorrow. (from Ansley_Barnes)
Create Date: Mar 11 2013 1:30PM

15.2.3 is released now. You can try it. Let us know how it goes. (from ethernet)
Create Date: Mar 11 2013 1:42PM

I just went to download it, actually, but there's no SSH image download link (policy prevents us from managing via unencrypted protocols) and the release notes link points, inexplicably, to the release notes for 12.3.1... I'm installing 15.1.4 right now to test the fix. (from Ansley_Barnes)
Create Date: Mar 11 2013 4:08PM

Looks like 15.1.4 might have done the trick - the session is back up, and hasn't reset throughout the entire process of downloading and installing the IPv4 route table. Once 15.2.3 is fully up with the SSH module, or 15.3.2 is out, I'll move to one of those so that we have both the fix and the features.

I'm asking our ISP to remove their inbound route filters they installed to allow us to test fixes, and I'm removing my own, and will bring the session fully up after hours to see if it's truly fixed.

You guys have been a lifesaver - our Extreme support contract didn't get renewed for some reason (we're working with our vendor to get it renewed now) so I was flying blind. I really appreciate the information and advice. (from Ansley_Barnes)
Create Date: Mar 12 2013 1:11PM

Looks like that was the fix. Thanks again, everyone, for your help. (from Ansley_Barnes)
Create Date: Mar 13 2013 8:02PM

So it appears the issue was indeed the AS 0 with aggregate. I did do some searching and found the proposed draft change to the RFC:

I'm curious as to why Extreme made this the default behavior before it made it into the RFC. There may have been a good reason, I'm just curious, since it obviously caused problems (hence the release of the patches allowing the behavior to be turned off.) (from Ansley_Barnes)
Create Date: Apr 16 2013 8:33AM

I reported the same problem around August last year. Back then I recorded the BGP updates and found the exact update causing the problem. You can filter on BGP UPDATE specific parameters in Wireshark, which is pretty useful in this case. Back then it started happening occasionally that Telecom Italia would announce a route with Aggregator AS0.

RFC4271 (BGP-4 http://www.ietf.org/rfc/rfc4271.txt) states that the only way to notify about an update error is to reset the BGP session.
It seems every other vendor implemented the mentioned draft long ago. I'm also curious why it would take Extreme so long. (from Kenneth_Oestrup)
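
For readers trying to confirm what a peer actually sent when it tore the session down, here is a minimal Python sketch that decodes a BGP NOTIFICATION message as laid out in RFC 4271. The "optional attributes error" from the original post corresponds to error code 3 (UPDATE Message Error) with subcode 9. The function names are illustrative, and the input is assumed to be one complete BGP message taken from a capture.

```python
import struct

BGP_HEADER_LEN = 19        # 16-byte marker + 2-byte length + 1-byte type
MSG_NOTIFICATION = 3       # BGP message type for NOTIFICATION
UPDATE_MESSAGE_ERROR = 3   # NOTIFICATION error code for UPDATE errors

# UPDATE Message Error subcodes from RFC 4271 (subcode 7 is deprecated).
UPDATE_ERROR_SUBCODES = {
    1: "Malformed Attribute List",
    2: "Unrecognized Well-known Attribute",
    3: "Missing Well-known Attribute",
    4: "Attribute Flags Error",
    5: "Attribute Length Error",
    6: "Invalid ORIGIN Attribute",
    8: "Invalid NEXT_HOP Attribute",
    9: "Optional Attribute Error",
    10: "Invalid Network Field",
    11: "Malformed AS_PATH",
}


def decode_notification(message: bytes):
    """Return (error_code, subcode, data) from one NOTIFICATION message."""
    if len(message) < BGP_HEADER_LEN + 2:
        raise ValueError("too short for a NOTIFICATION")
    if message[:16] != b"\xff" * 16:
        raise ValueError("bad marker")
    length, msg_type = struct.unpack_from("!HB", message, 16)
    if msg_type != MSG_NOTIFICATION:
        raise ValueError("not a NOTIFICATION message")
    if length != len(message):
        raise ValueError("length field does not match message size")
    return message[19], message[20], message[21:]


def describe(code: int, subcode: int) -> str:
    """Human-readable label; only UPDATE errors are named in this sketch."""
    if code == UPDATE_MESSAGE_ERROR:
        name = UPDATE_ERROR_SUBCODES.get(subcode, f"subcode {subcode}")
        return f"UPDATE Message Error: {name}"
    return f"error code {code}, subcode {subcode}"
```

The data field returned by `decode_notification` is worth inspecting too, since for UPDATE errors it may carry the offending attribute, depending on what the sending implementation includes.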