08-28-2018 06:53 AM
09-04-2018 12:49 PM
Thanks for those logs, Michael. I've emailed you a more detailed explanation of what we saw, but in case anyone else has the same problem, I wanted to post a brief overview of what we found and what we need to look at next.
In the data we were seeing failed .lpr files. You can check this by opening the buffered log and using CTRL+F to search for ".lpr"; if the word "failed" appears on the same line, you know they aren't getting through. This indicates that traffic isn't getting through a firewall or content filter, or that there is a delay on the backend network.
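If the buffered log is long, the same check can be scripted instead of searched by hand. This is only a rough sketch: it assumes the log has been saved as plain text, and the function name is made up for illustration. It looks for nothing more than ".lpr" and "failed" on the same line, exactly as described above.

```python
def failed_lpr_lines(log_text):
    """Return every log line that mentions both '.lpr' and 'failed'.

    The match is case-insensitive and purely textual; it makes no
    assumption about the rest of the log line's layout.
    """
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if ".lpr" in lowered and "failed" in lowered:
            hits.append(line)
    return hits
```

To use it, read the saved log into a string (for example with `open(path).read()`, where `path` is wherever you saved the log) and print whatever the function returns. An empty result means no failed .lpr transfers were logged in that capture.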
We also saw that we were failing to reach the VHM server via HTTP, which also points to a firewall issue.
Finally, we were seeing echo timeouts. The HiveManager and AP (or any other Aerohive device) have a call and response system to make sure that the APs are still responding to the HiveManager and can therefore be considered connected to it. If the AP does not respond to enough call and response echo packets, the HiveManager considers that device to be disconnected until it starts responding to echoes again. This also indicates either a slowdown in your network traffic or a firewall issue.
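For anyone who wants a mental model of that call and response check, here is a toy sketch of the idea. It is not how HiveManager is actually implemented: the class name, the rule of counting consecutive misses, and the threshold of 3 are all assumptions made for illustration, since the real interval and limit aren't covered in this thread.

```python
class EchoTracker:
    """Toy model: a device counts as connected until it misses too many echoes."""

    def __init__(self, max_missed=3):
        # max_missed is an assumed threshold, not a documented value.
        self.max_missed = max_missed
        self.missed = 0

    def record(self, responded):
        """Record one echo attempt; any response resets the miss counter."""
        self.missed = 0 if responded else self.missed + 1

    @property
    def connected(self):
        """True while consecutive misses stay below the threshold."""
        return self.missed < self.max_missed
```

The point of the sketch is that a few dropped echoes on a slow or filtered path are enough to flip a device to disconnected, and a single successful response brings it back.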
If we are sure that the firewall is allowing outbound traffic on UDP 12222, TCP 22, TCP 443, and TCP 80 (HTTP), then we'll want to run iPerf tests to see if we can find where the traffic is slowing down on the backend network. I sent you a guide that covers how to set up and run iPerf tests for reference.
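Before moving on to iPerf, the TCP ports in that list can be spot-checked from any machine on the same network segment. The sketch below uses only Python's standard `socket` module; the function name is made up for illustration. Two caveats: it tests from the machine running the script rather than from the AP itself, and it cannot verify UDP 12222, because a plain connect test only works for TCP.

```python
import socket


def tcp_port_open(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake, then we close.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling `tcp_port_open(host, 22)`, `tcp_port_open(host, 443)`, and `tcp_port_open(host, 80)` against the HiveManager address shows which TCP ports accept connections from where you are standing. A False on one port while the others return True suggests something specific to that service or a rule for that port rather than a general network slowdown.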
09-04-2018 07:12 AM
Hi Sam,
Logs and info sent.
Tx,
Mike
08-29-2018 08:50 PM
We were unable to pass traffic on TCP port 22 during one of those tests, which could cause an update to fail. We can confirm that this is the cause by running the following debugs on the AP, replicating the issue, and then pulling tech data.
Debugs:
_debug capwap info
_debug capwap basic
_debug capwap stat
If you want to send the tech data to me directly at communityhelp@aerohive.com, I can review it and let you know what we find.
If you think it might be a rule on the AP firewall, could you also provide a screenshot of the rules you have in your IP firewall configuration?
08-29-2018 01:58 PM
That's odd, it's all on a local network and there are no firewalls in place internally. Perhaps a rule on the AP itself?
AH-MikesOffice#exec _test tcp-service host 192.168.4.7 port 443
Testing TCP connection for host=192.168.4.7, port=443, timeout=10 seconds
Test successfully.
AH-MikesOffice#exec _test tcp-service host 192.168.4.7 port 22
Testing TCP connection for host=192.168.4.7, port=22, timeout=10 seconds
Test failed:Connection refused, maybe the TCP service on the port doesn't provide.
AH-MikesOffice#