NAT Error "Global IP addresses exhausted for pool"

  • Question
  • Updated 3 years ago
  • Answered

We have deployed an SSA 150 as the central core device in our network.

We have a full public Class C network, but have only defined 10 NAT pools with one public IP per pool.

The pools are for different VLANs and user groups (employees, guests, ...).

There are roughly 100 to 300 devices online in each group.


For the past 4 days this error has shown up on my syslog server 20 times a day:

RtrNat[1]Router global: Failed to allocate ip address (Global IP addresses
exhausted for pool) reported 1 times


Even on a Sunday with very little traffic it appears nearly 10 times.


What is the problem here? Can one public IP only handle a limited number of private NAT translations?

Of course I could grow the pools and give each pool 3 or 4 public IPs.

But first I want to know whether that would fix the problem.

Chris         





Photo of info@systemhaus-genthin.de

Posted 3 years ago


Jeremy, Embassador

One user doing a port scan on the internet could easily exhaust a single IP NAT pool.

Jeremy, Embassador

Or BitTorrent traffic can use a lot.
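
To put rough numbers on the port scan and BitTorrent remarks, here is a minimal sketch. It assumes every new flow ties up one NAPT port until its mapping ages out, so the number of mappings held at steady state is the flow rate times the timeout. The flow rates and the 120 s timeout are made-up illustration values, not SSA 150 defaults.

```python
# Rough estimate of how many NAPT port mappings a traffic source keeps busy.
# All rates and timeouts below are illustrative assumptions, not measured values.

USABLE_PORTS = 65536 - 1024  # ports per public IP, minus the well-known range


def concurrent_translations(new_flows_per_second: float, timeout_seconds: float) -> float:
    """Steady-state mappings held = arrival rate x lifetime of each mapping."""
    return new_flows_per_second * timeout_seconds


def exhausts_pool(new_flows_per_second: float, timeout_seconds: float,
                  usable_ports: int = USABLE_PORTS) -> bool:
    """True if the steady-state load exceeds what one public IP can map."""
    return concurrent_translations(new_flows_per_second, timeout_seconds) > usable_ports


# Hypothetical port scan: 1,000 probes per second, mappings aged out after 120 s.
scan_load = concurrent_translations(1000, 120)
print("scan:", scan_load, "exhausts pool:", exhausts_pool(1000, 120))

# Hypothetical ordinary client: 5 new flows per second, same timeout.
client_load = concurrent_translations(5, 120)
print("client:", client_load, "exhausts pool:", exhausts_pool(5, 120))
```

Under these assumptions a single scanning host would want about 120,000 mappings, well past what one address offers, while an ordinary client stays in the hundreds.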
Yes, the error itself is clear, but where can I find the limit for this pool, so that I know how much to increase it?
What is the limit for one public IP?
And how can I see the current usage of this pool?

Some screenshots attached:

Jeremy, Embassador

Hmm... there should be around 65,536 ports for 1 IP address. I also don't see the counters going up. I think if the message is not happening on a very regular basis (not every few seconds), it could just be a NAT miss. I see these on our Cisco ASR all the time.

*Sep 28 11:19:07: %IOSXE-6-PLATFORM: F0: cpp_cp: QFP:0.0 Thread:087 TS:00001467309607427649 %NAT-6-ADDR_ALLOC_FAILURE: Address allocation failed; pool 10 may be exhausted

I know the pool isn't exhausted, but for some reason the ASR can't allocate a port translation. Cisco said this isn't anything to worry about as long as it's not constant.

That sounds good, but the "NO GLOBAL IP Adr" counter is at 14241, and the system has only been up for 2 weeks.
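
A quick back-of-the-envelope check of that counter against the syslog rate, as a sketch only: it assumes "2 weeks" means exactly 14 days and that the counter started at zero at boot.

```python
# Compare the failure counter with the number of syslog messages seen per day.
no_global_ip_count = 14241     # counter value quoted above
uptime_days = 14               # "2 weeks", assumed to be exactly 14 days
syslog_messages_per_day = 20   # figure from the original question

failures_per_day = no_global_ip_count / uptime_days
ratio = failures_per_day / syslog_messages_per_day

print(f"average failures per day: {failures_per_day:.0f}")
print(f"failures per syslog message: {ratio:.0f}")
```

If the failures only started 4 days ago, as the syslog entries suggest, the daily average would be higher still. Either way the counter is growing much faster than the 20 messages a day would imply.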

Mike D, Alum

Hello Chris,


Jeremy's experienced input covers lots of ground: roughly 64k translations per address are theoretically possible (65,536 ports minus the well-known port range), which is more than the 150 series can handle in total. Torrent behavior, port scans, and having seen similar messages on other vendors' equipment: that's all good info.

To that I'll add that S-Series NAT operation is quite robust. The protocol is well understood, the firmware is mature and the app gets plenty of field exercise. The switch itself is rock solid.
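
As a rough illustration of the per-address budget, the sketch below divides the theoretical port space by the device counts from the original question. It assumes one port per translation, an even spread across devices, and that ports 0-1023 are excluded; none of this reflects the SSA 150's actual table limits.

```python
# Theoretical NAPT port budget per device for a given pool size.
TOTAL_PORTS = 65536
WELL_KNOWN_PORTS = 1024  # ports 0-1023, assumed to be excluded from translation


def ports_per_device(public_ips: int, devices: int) -> int:
    """Usable ports across the pool, split evenly over the devices behind it."""
    usable = (TOTAL_PORTS - WELL_KNOWN_PORTS) * public_ips
    return usable // devices


# One public IP per pool, with the group sizes mentioned in the question.
for devices in (100, 300):
    print(devices, "devices, 1 IP:", ports_per_device(1, devices), "ports each")

# The same largest group if the pool were grown to 4 public IPs.
print("300 devices, 4 IPs:", ports_per_device(4, 300), "ports each")
```

Even with 300 devices on one address each device would still have a couple of hundred ports on paper, so a single busy host is a more likely trigger than ordinary load.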

Since the error seems to be tripped by a transient state - and no mention of user complaints, it sounds like the net impact is limited to those messages.   

Because of the overall stability I'm still a little concerned about these errors.  If they're to be believed, there are events sourced from your network that need investigation.  If they're the result of a bug, (in spite of the testimonial, stranger things have happened) it's one I haven't heard of and not something release notes indicate has ever been fixed.

I'd suggest a call to GTAC for a closer look at some of the variables that may be involved.  
   
Best regards,
Mike

There's a growing chance this forum will link you to a user with the exact answer to your exact complaint.  I've learned a good deal from watching these HUB exchanges.  Glad to see you use this resource as an early part of your investigation.

Mike D, Alum

Incidentally, 10 public IP addresses is the limit for NAPT translation. 10 source list rules is the maximum as well.
Since the error message complains of an out-of-resource condition, it would not be unreasonable to edit the config accordingly as a troubleshooting measure.
If I were to make a recommendation given the information at hand, it would be to use less public IP address space rather than more: 8 pools with 8 public addresses, for example. Speculation only, of course, but tuning the config may be worth adding to your action plan.
One of my problems is that I cannot see which of the pools is affected. There is only the syslog message:
Global IP addresses exhausted for pool

but no pool is named.
Of course I can change how the IP addresses are divided across the VLANs. I set up these 10 pools to make better use of my public IP range. In the past we often had trouble with users, which is why I wanted a very granular split into separate VLANs, with each VLAN mapped to one pool.
As I understand Michael, I should consolidate some of the pools into one pool with 3 or 4 public addresses.

Mike D, Alum

Hi Chris,

I understand your question, but I don't know why you're seeing this error. There's nothing pointing to a single root cause or a course of action that's sure to fix the problem.
At this stage there are probably better plans than a remix of your pools.  The idea will be handy to have in the toolbox for later.

With no clear root cause a methodical troubleshoot is typically the next step.  Unfortunately troubleshooting method requires trial and error - and while shared experience makes participating in the hub a no-brainer, the tedious back and forth of in depth network troubleshooting isn't always a great mix for this sort of forum.

I recommend the classic start - physical layer.  Then statistics and states at L2, then L3, then NAT application stats and tables.  There’s more than one way to approach this I’m certain but I don’t know any other way to do the work. 

I encourage others in the community to add troubleshooting tips or experiences that might improve odds of a quick resolution. 

That said, here are a few items to help the cause:  

* Specific hardware and firmware; release note review is always of interest.
* NAT config (cone NAT etc.).
* Firewall/DMZ location.
* Switch/router config.
* Physical topology; traffic flow in and out of the NAT.
* Review L1 stats for high or low frame counts, errors, flow control, LACP/LAG health, etc.
* Review L2 topology, STP and FDB stability.
* By default the switch collects 24 hrs of RMON history. Review traffic spikes and time frames. This data may also point toward a problem source.
* Event record and a reference to correlate events.
* A gauge of the flow count (unique sa/da-sip/dip-tcp/udp stream) on a port. If a traffic event such as a port scan occurred, the timestamp on the flowlimit stat high-water mark will help. Correlate this with RMON history and log entries.
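
For the correlation step, a minimal sketch of bucketing the NAT error by hour, so the busy hours can be lined up against RMON history and the flow high-water timestamps. It assumes each syslog line starts with an ISO 8601 timestamp followed by a space; that format and the sample lines (including their dates) are hypothetical and would need adjusting to whatever the syslog server actually writes.

```python
from collections import Counter
from datetime import datetime

ERROR_TEXT = "Global IP addresses exhausted for pool"


def errors_per_hour(lines):
    """Count NAT exhaustion messages per clock hour.

    Assumes each line begins with an ISO 8601 timestamp and a space
    (a hypothetical format; adapt the parsing to the real log).
    """
    counts = Counter()
    for line in lines:
        if ERROR_TEXT not in line:
            continue
        stamp = line.split(" ", 1)[0]
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines whose timestamp does not parse
        counts[when.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


# Made-up sample lines; only the message text comes from the thread.
sample = [
    "2020-01-05T03:12:45 RtrNat[1]Router global: Failed to allocate ip address "
    "(Global IP addresses exhausted for pool) reported 1 times",
    "2020-01-05T03:40:02 RtrNat[1]Router global: Failed to allocate ip address "
    "(Global IP addresses exhausted for pool) reported 1 times",
    "2020-01-05T04:01:10 some unrelated message",
    "2020-01-05T09:15:30 RtrNat[1]Router global: Failed to allocate ip address "
    "(Global IP addresses exhausted for pool) reported 1 times",
]

counts = errors_per_hour(sample)
for hour, n in sorted(counts.items()):
    print(hour.isoformat(), n)
```

Hours that stand out here are the ones worth checking first in the RMON history and the flowlimit statistics.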



 Best regards,

Mike