Hardware route table full issues

  • Problem
  • Updated 2 years ago
  • Solved
I've just been troubleshooting a rather odd network latency problem.

Scenario is that we have two X480s doing BGP with some Internet transit providers and peers.  All fairly standard ISP setup - the transit providers give a default route, and the peers provide some more specifics - due to lack of room in the X480 for a full global IP routing table.

We were seeing latency jumps of over 100ms on all packets going through one of these switches earlier (the one that happened to have all of the BGP sessions for the peer routes on it).  This increase in latency went away when these peers were shut down and the peer routes removed from the routing table.  No process was maxing out the CPU in top though, so it didn't look like a classic slow-path issue.

The X480 should be able to cope with 256K IPv4 routes but only 8K IPv6 routes.

The BGP feeds were providing, just before I closed the peers, a total of around 81000 IPv4 routes and 26000 IPv6 routes.

With the extra peer routes removed, the total size of the routing table on the switch dropped to around 450 routes and everything was (and still is) happy again.

The switch log was complaining about:
01/26/2016 17:35:23.05 <Info:HAL.IPv6FIB.LPMTblFull> IPv6 route not added to hardware. Hardware LPM Table full.
which makes some sense given that 26K > 8K.

So, to the key part of the question.  Assuming IPv6 traffic was very low (it was), should the overflow of the IPv6 hardware table have affected IPv4 forwarding?  It very much seemed to have done so in this instance; we were testing with IPv4 traffic to devices on the network and all suffered increased latency.

And as a followup, is there a way to re-carve this on an X480 to give more IPv6 hardware entries?



Paul Thornton


Posted 2 years ago


Grosjean, Stephane, Employee


The X480 has several configuration modes for IPv4 and IPv6. Assuming you have a "recent" EXOS version (there were some modifications around the 15.2 or 15.4 timeframe, I can't remember precisely which one), the config forwarding external-tables CLI command offers specific modes to help with IPv6.

The command show forwarding configuration will show you your current setup (the default is l2-and-l3).

Depending on your needs, you might want to look at the enhanced IPv6 settings:

- l3-only ipv4-and-ipv6 gives 464k/48k for LPM (IPv4/IPv6)
- l3-only ipv6 gives 16k/240k

The other modes allow only 8k for IPv6 LPM.

You can also turn on compression for IPv4 and IPv6 (two separate commands). It can help significantly.

Changing the forwarding configuration will require a reboot.
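
As a rough illustration of the sizing decision, here is a minimal Python sketch that checks which of the modes mentioned in this thread could hold a given pair of route counts. The capacities are only the figures quoted above (256K/8K for the default l2-and-l3 mode, 464k/48k and 16k/240k for the two l3-only modes); reading "k" as 1000 is an assumption, and modes_that_fit is a made-up helper name, not anything from EXOS.

```python
# Hardware LPM capacities per external-table mode, as quoted in this thread.
# Assumption: "k" is read as 1000 (it may really mean 1024 on the switch).
LPM_CAPACITY = {
    "l2-and-l3": {"ipv4": 256_000, "ipv6": 8_000},
    "l3-only ipv4-and-ipv6": {"ipv4": 464_000, "ipv6": 48_000},
    "l3-only ipv6": {"ipv4": 16_000, "ipv6": 240_000},
}


def modes_that_fit(ipv4_routes, ipv6_routes):
    """Return the modes whose LPM tables can hold both route counts."""
    return [
        mode
        for mode, cap in LPM_CAPACITY.items()
        if ipv4_routes <= cap["ipv4"] and ipv6_routes <= cap["ipv6"]
    ]


# Route counts reported just before the peers were shut down.
print(modes_that_fit(81_000, 26_000))  # ['l3-only ipv4-and-ipv6']
```

This ignores route compression, which would reduce the number of entries that actually reach the hardware.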

Paul Thornton

Hi Stephane,

Ah.  That was the magic command I was trying to remember to look at this.  Thank you.

The switch is running - I think that comes under the 'new enough' heading :)

So we have:

inet1.1 # show forwarding configuration

L2 and L3 Forwarding table hash algorithm:
    Configured hash algorithm:              crc32
    Current hash algorithm:                 crc32

L3 Dual-Hash configuration:
    Configured setting:                     on
    Current setting:                        on
    Dual-Hash Recursion Level:              1

Hash criteria for IP unicast traffic for L2 load sharing and ECMP route sharing
    Sharing criteria:                       L3_L4

IP multicast:
    Group Table Compression:                on
    Local Network Forwarding:               slow-path
    Lookup-Key:                             (SourceIP, GroupIP, VlanId)

External lookup tables:
    Configured Setting:                     l2-and-l3
    Current Setting:                        l2-and-l3

Switch Settings:
    Switching mode:                         store-and-forward

L2 Protocol:
    Fast convergence:                       on

Fabric Flow Control:
    Fabric Flow Control:                    auto

And from what you're saying, I need to switch from l2-and-l3 to l3-only use; and I had totally forgotten about compression (assuming you're talking about 'enable iproute compression').

I can see that I need to do a:
config forwarding external-tables l3-only ipv4-and-ipv6

Is there any flexibility in that 464k/48k split for the V4/V6 routes - I was looking at 'config iproute reserved-entries ...' which looks like it may do what I want there.  Obviously, any split you do between v4 and v6 routes is a tradeoff as the number of routes of each is increasing daily on the Internet.

Quickly asking again about the cause of the issue earlier (I haven't opened a case on this yet, I thought I'd ask here to see if anyone had any ideas).  If the V6 table was full, should we expect to see degraded performance across the whole switch, including V4 forwarding?


Grosjean, Stephane, Employee

Yes, I was referring to iproute compression.

for IPv4 it's: enable iproute compression.
for IPv6 it's: enable iproute ipv6 compression

As for the balance of IPv4/IPv6, unfortunately no, there's no such flexibility; you have to pick one of the predefined settings. The split in the l3-only ipv4-and-ipv6 setting came out of a long discussion about the total amount of IPv4 that was necessary. With iproute compression, it should have enough room in the FIB for both an IPv4 and an IPv6 full view.

The config iproute reserved-entries will not help you in your case. It's more about allowing EXOS to use a part of the LPM for some clever optimization.

As for the performance, I'm not sure. It would require investigation.

Grosjean, Stephane, Employee

Logically, even if some IPv6 traffic is going slow-path, IPv4 traffic that has entries in HW should not be affected. So your experience would mean that some IPv4 traffic was also going slow-path. Remember that one possible reason for that is IPv4 traffic with IP Options in the header.

To check that kind of thing:

show iproute reserved-entries statistics

show ipstats | inc Forw

show ipstats ipv6 | inc Forw

... wait 10 seconds

show ipstats | inc Forw

show ipstats ipv6 | inc Forw
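
The before-and-after comparison above boils down to a counter delta over the wait interval. A minimal Python sketch of that arithmetic follows, assuming (as the reply implies) that the Forw counters reflect packets forwarded in software. The function name and the sample values are hypothetical, and counter wrap is not handled.

```python
def slow_path_rate(first_sample, second_sample, interval_seconds):
    """Packets per second forwarded in software between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (second_sample - first_sample) / interval_seconds


# Hypothetical counter values read 10 seconds apart.
print(slow_path_rate(120_000, 125_000, 10))  # 500.0
```

A rate that stays near zero would suggest the traffic is being forwarded in hardware; a rate that climbs when the peer routes are loaded would point towards slow-path forwarding.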


Paul Thornton

Thanks for all that.  I'm very suspicious that the problem stopped when we dropped a lot of routes (all of the traffic coming via these sources would have just fallen over to a different link, and not gone away).

What we'll do is bring the additional routes back in before changing anything else, and see if I can provoke the problem, and then collect the various ipstats from the switches.  Once I have that, I'll update here and talk to the TAC if there's nothing obvious.

I'm not sure that opening a case right now would help anyway as there is nothing to go on :(


Grosjean, Stephane, Employee

Yes, it's usually difficult to find the reason for an issue when troubleshooting a (now) working network. :)

Paul Thornton

Hi all,

A brief followup on this as I've been doing some lab testing today.

I have an X480 here, and I've reconfigured it to use the l3-only ipv4-and-ipv6 forwarding configuration with iproute compression enabled.  All went as expected.

So, in the interests of science, I sent it a full Internet routing table:

* lab_inet1.63 # show iproute summary
=================ROUTE SUMMARY=================
Mask distribution:
       1 default routes               16 routes at length  8
      13 routes at length  9          35 routes at length 10
     101 routes at length 11         265 routes at length 12
     507 routes at length 13        1029 routes at length 14
    1764 routes at length 15       12944 routes at length 16
    7346 routes at length 17       12372 routes at length 18
   25189 routes at length 19       36581 routes at length 20
   38614 routes at length 21       63736 routes at length 22
   54674 routes at length 23      315613 routes at length 24
       1 routes at length 25           1 routes at length 26
      12 routes at length 27          20 routes at length 28
      38 routes at length 29          59 routes at length 30
       7 routes at length 31          44 routes at length 32

Route origin distribution:
570874 EBGP          2    Blackhole     31   Static    
75   Direct       

Total number of routes = 570982
Total number of compressed routes = 287247

My question here is do the last two lines mean it has 570K routes in total (it does, that's pretty much the size of the global routing table), of which 287K are compressed?

So am I actually using up 283K routes in the hardware table?  Or 570K?  Or somewhere in between?  Is there a way to find out?

I see this, and no log entries, so although the switch holds more than 458K routes, the hardware isn't overfilled:

lab_inet1.65 # show iproute reserved-entries
                        IPv4       # Reserved Routes            Minimum #
Slot  Type              Routes      IPv4   (or IPv6)            IPv4 Hosts
----  ----------------  --------  -------  ------------------   ----------
1     X480-24x(10G4X)   External   458720  (  ext.) [default]        16384

but that doesn't change when I add routes to the switch.

As a benchmark, the X480 had no problems loading the full BGP table.  It was over a WAN link, and took 6 minutes 40 seconds, with dcbgp near the top of top for the duration.


Grosjean, Stephane, Employee

To check, the command is show iproute reserved-entries statistics.

It means that the RIB has 570982 entries, of which 287247 have been compressed, so the FIB has been given 570982 - 287247 = 283735 entries.

What you see with the command you typed (show iproute reserved-entries - you were not far from the right command ;)) is the number of entries reserved for LPM entries in the LPM HW table, which means the rest may be used by other entries (ARP...).
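
For anyone wanting to repeat the arithmetic with their own numbers, here is a small Python sketch using the figures from this thread. Comparing the FIB entry count directly against the reserved IPv4 LPM figure is a simplifying assumption for illustration, and fib_usage is a made-up helper name; show iproute reserved-entries statistics remains the authoritative view.

```python
def fib_usage(rib_routes, compressed_routes, reserved_lpm_entries):
    """Return (entries sent to the FIB, remaining reserved LPM entries)."""
    fib_entries = rib_routes - compressed_routes
    headroom = reserved_lpm_entries - fib_entries
    return fib_entries, headroom


# Figures from "show iproute summary" and "show iproute reserved-entries" above.
fib_entries, headroom = fib_usage(570_982, 287_247, 458_720)
print(fib_entries)  # 283735
print(headroom)     # 174985
```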

Paul Thornton

There's plenty of headroom there, given that we have no intention of putting a full v4 or v6 Internet routing table in the switch.

Thanks for the quick answer, Stephane.