ExtremeSwitching (EXOS)

 View Only
  • 1.  Hardware route table full issues

    Posted 01-26-2016 17:45
    I've just been troubleshooting a rather odd network latency problem.

    Scenario is that we have two X480s doing BGP with some Internet transit providers and peers. All fairly standard ISP setup - the transit providers give a default route, and the peers provide some more specifics - due to lack of room in the X480 for a full global IP routing table.

    We were seeing latency jumps of over 100ms on all packets going through one of these switches earlier (which happened to have all of the BGP sessions for the peer routes on it). This increase in latency went away when these peers were shut down and the peer routes removed from the routing table. No processes were maxing out the CPU at the top of top though, so it didn't look like a classic slow-path issue.

    The X480 should be able to cope with 256K IPv4 routes but only 8K IPv6 routes.

    The BGP feeds were providing, just before I closed the peers, a total of around 81000 IPv4 routes and 26000 IPv6 routes.

    With the extra peer routes removed, the total size of the routing table on the switch dropped to around 450 routes and everything was (and still is) happy again.

    The switch log was complaining about:
    01/26/2016 17:35:23.05 [i] IPv6 route not added to hardware. Hardware LPM Table full.
    which makes some sense given that 26K > 8K.

    So, to the key part of the question. Assuming IPv6 traffic was very low (it was), should the overflow of the IPv6 hardware table have affected IPv4 forwarding? It very much seemed to have done so in this instance; we were testing with IPv4 traffic to devices on the network and all suffered increased latency.

    And as a followup, is there a way to re-carve this on an X480 to give more IPv6 hardware entries?



  • 2.  RE: Hardware route table full issues

    Posted 01-26-2016 19:27

    The X480 has several configuration modes for IPv4 and IPv6. Assuming you have a "recent" EXOS version (there were some modifications around the 15.2 or 15.4 timeframe, I can't remember precisely which), the config forwarding external-tables CLI command offers specific modes to help with IPv6.

    The command show forwarding configuration can help you find out your current setup (the default is l2-and-l3).

    Depending on your needs, you might want to look at the enhanced IPv6 settings:

    - l3-only ipv4-and-ipv6 gives 464k/48k for LPM (IPv4/IPv6)
    - l3-only ipv6 gives 16k/240k

    The other modes allow only 8k for IPv6 LPM.

    You can also turn on compression for IPv4 and IPv6 (two separate commands). It can help significantly.

    Changing the forwarding configuration will require a reboot.
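
    Putting the pieces together, the re-carve would look something like this (a sketch using only the commands mentioned in this thread; verify the exact syntax against the command reference for your EXOS version):

    ```
    # select the enhanced external-table mode (464k IPv4 / 48k IPv6 LPM)
    config forwarding external-tables l3-only ipv4-and-ipv6

    # enable route compression for both address families (two separate commands)
    enable iproute compression
    enable iproute ipv6 compression

    # verify the configured setting, then reboot so the
    # external-tables change takes effect
    show forwarding configuration
    reboot
    ```

    Note that until the reboot, show forwarding configuration will report the new mode under "Configured Setting" while "Current Setting" still shows the old one.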

  • 3.  RE: Hardware route table full issues

    Posted 01-26-2016 19:49
    Hi Stephane,

    Ah. That was the magic command I was trying to remember to look at this. Thank you.

    The switch is running - I think that comes under the 'new enough' heading :)

    So we have:

    inet1.1 # show forwarding configuration

    L2 and L3 Forwarding table hash algorithm:
    Configured hash algorithm: crc32
    Current hash algorithm: crc32

    L3 Dual-Hash configuration:
    Configured setting: on
    Current setting: on
    Dual-Hash Recursion Level: 1

    Hash criteria for IP unicast traffic for L2 load sharing and ECMP route sharing
    Sharing criteria: L3_L4

    IP multicast:
    Group Table Compression: on
    Local Network Forwarding: slow-path
    Lookup-Key: (SourceIP, GroupIP, VlanId)

    External lookup tables:
    Configured Setting: l2-and-l3
    Current Setting: l2-and-l3

    Switch Settings:
    Switching mode: store-and-forward

    L2 Protocol:
    Fast convergence: on

    Fabric Flow Control:
    Fabric Flow Control: auto

    And from what you're saying, I need to switch from l2-and-l3 to one of the l3-only modes; and I had totally forgotten about compression (assuming you're talking about 'enable iproute compression').

    I can see that I need to do a:
    config forwarding external-tables l3-only ipv4-and-ipv6

    Is there any flexibility in that 464k/48k split for the v4/v6 routes? I was looking at 'config iproute reserved-entries ...', which looks like it may do what I want there. Obviously, any split you do between v4 and v6 routes is a tradeoff, as the number of routes of each is increasing daily on the Internet.

    Quickly asking again about the cause of the issue earlier (I haven't opened a case on this yet, I thought I'd ask here to see if anyone had any ideas). If the V6 table was full, should we expect to see degraded performance across the whole switch, including V4 forwarding?


  • 4.  RE: Hardware route table full issues

    Posted 01-26-2016 20:16
    Yes, I was referring to iproute compression.

    For IPv4 it's: enable iproute compression.
    For IPv6 it's: enable iproute ipv6 compression.

    As for the balance of IPv4/IPv6, unfortunately no, there's no such flexibility; you have to pick one of the predefined settings. The l3-only ipv4-and-ipv6 setting was the subject of a long discussion about how much IPv4 capacity was necessary. With iproute compression, it should leave enough room in the FIB for a full view of both IPv4 and IPv6.

    The config iproute reserved-entries command will not help in your case. It's more about allowing EXOS to use part of the LPM table for some clever optimization.

    As for the performance, I'm not sure. It would require investigation.

  • 5.  RE: Hardware route table full issues

    Posted 01-27-2016 07:19
    Logically, even if some IPv6 traffic is going slow-path, the IPv4 entries in hardware should be unaffected. So your experience would mean some IPv4 traffic was also going slow-path. Remember that one such reason could be IPv4 traffic with IP Options in the header.

    To check that kind of thing:

    show iproute reserved-entries statistics

    show ipstats | inc Forw

    show ipstats ipv6 | inc Forw

    ... wait 10 seconds

    show ipstats | inc Forw

    show ipstats ipv6 | inc Forw

  • 6.  RE: Hardware route table full issues

    Posted 01-27-2016 12:34
    Thanks for all that. I'm very suspicious that the problem stopped when we dropped a lot of routes (all of the traffic coming via these sources would have just fallen over to a different link, and not gone away).

    What we'll do is bring the additional routes back in before changing anything else, and see if I can provoke the problem, and then collect the various ipstats from the switches. Once I have that, I'll update here and talk to the TAC if there's nothing obvious.

    I'm not sure that opening a case right now would help anyway as there is nothing to go on :(


  • 7.  RE: Hardware route table full issues

    Posted 01-27-2016 14:22
    Yes, it's usually difficult to find the reason for an issue when troubleshooting a (now) working network. :)

  • 8.  RE: Hardware route table full issues

    Posted 02-10-2016 14:19
    Hi all,

    A brief followup on this as I've been doing some lab testing today.

    I have an X480 here, and I've reconfigured it to have l3-only ipv4-and-ipv6 forwarding configuration and ip route compression enabled. All went as expected.

    So, in the interests of science, I sent it a full Internet routing table:

    * lab_inet1.63 # show iproute summary
    =================ROUTE SUMMARY=================
    Mask distribution:
    1 default routes 16 routes at length 8
    13 routes at length 9 35 routes at length 10
    101 routes at length 11 265 routes at length 12
    507 routes at length 13 1029 routes at length 14
    1764 routes at length 15 12944 routes at length 16
    7346 routes at length 17 12372 routes at length 18
    25189 routes at length 19 36581 routes at length 20
    38614 routes at length 21 63736 routes at length 22
    54674 routes at length 23 315613 routes at length 24
    1 routes at length 25 1 routes at length 26
    12 routes at length 27 20 routes at length 28
    38 routes at length 29 59 routes at length 30
    7 routes at length 31 44 routes at length 32

    Route origin distribution:
    570874 EBGP 2 Blackhole 31 Static
    75 Direct

    Total number of routes = 570982
    Total number of compressed routes = 287247

    My question here is: do the last two lines mean it has 570K routes in total (it does; that's pretty much the size of the global routing table), of which 287K are compressed?

    So am I actually using up 283K routes in the hardware table? Or 570K? Or somewhere in between? Is there a way to find out?

    I see this, and no log entries, so even though it has more than 458K routes the hardware isn't overfilled:

    lab_inet1.65 # show iproute reserved-entries
    IPv4 # Reserved Routes Minimum #
    Slot Type Routes IPv4 (or IPv6) IPv4 Hosts
    ---- ---------------- -------- ------- ------------------ ----------
    1 X480-24x(10G4X) External 458720 ( ext.) [default] 16384

    but that doesn't change when I add routes to the switch.

    As a benchmark, the X480 had no problems loading the full BGP table. It was over a WAN link, and took 6 minutes 40 seconds, with dcbgp near the top of top for the duration.


  • 9.  RE: Hardware route table full issues

    Posted 02-10-2016 14:31
    To check, the command is show iproute reserved-entries statistics.

    The meaning is that the RIB has 570982 entries, from which 287247 have been compressed, meaning the FIB has been provided with 570982 - 287247 = 283735 entries.

    What you see with the command you typed (show iproute reserved-entries - you were not far from the right command ;)) is the number of entries reserved for LPM routes in the LPM hardware table. Which means the rest may be used by other entries (ARP, ...).
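
    The arithmetic is just a subtraction over the two counters from show iproute summary; as a sanity check (plain Python, numbers taken from the output above):

```python
# Counters from the "show iproute summary" output earlier in the thread
total_rib_routes = 570982    # total entries in the RIB
compressed_routes = 287247   # entries elided by iproute compression

# Entries actually programmed into the hardware FIB
fib_entries = total_rib_routes - compressed_routes
print(fib_entries)  # 283735

# That is well under the 458720 LPM entries shown by
# "show iproute reserved-entries", which is why no
# "Hardware LPM Table full" messages were logged
lpm_reserved = 458720
print(fib_entries < lpm_reserved)  # True
```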

  • 10.  RE: Hardware route table full issues

    Posted 02-10-2016 14:41
    There's plenty of headroom there, given that we have no intention of putting a full v4 or v6 Internet routing table in the switch.

    Thanks for the quick answer, Stephane.