ExtremeWireless (WiNG)


wing 5.8.5 reports AP's as down but they are not

  • 1.  wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 09:28
    Hi All
    We had a power outage last week. Some parts of the network stayed up on the UPS, and on others the UPSs failed after 20 minutes. Since then the RFS shows that some of the APs are down, but they have clients connected, and I can get onto the APs shown as down via their IP addresses. I have reset the units, but the RFS (WiNG 5.8.5) still shows them as down.
    I have also restarted the RFS units as well.
    Now I'm confused (oh no, I'm always confused, that's normal).



  • 2.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 10:24
    Hi Phil, the RFS showing some APs as down points to an adoption issue. If you log into the controller's CLI and do a "show adoption offline", do the 'down' APs show up? You can also log directly into an AP and enter "show adoption status"; it will tell you whether the AP is adopted.

    The AP will be able to handle users if all the VLANs are locally bridged. In this configuration, the WLAN does not require any of the controller's resources.

    The adoption issue will require some investigation. If the AP is on a VLAN different from the controller's, it will need a "controller host" entry to point back to the controller.
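
    For reference, here is a minimal sketch of those checks and of a static controller entry (the IP is a placeholder, and the exact syntax should be verified against your WiNG release):

    ! On the RFS - list APs that should be adopted but are currently offline
    show adoption offline
    ! On an AP - check its own adoption state
    show adoption status
    ! On the AP (device or profile config context), point it back at the controller
    ! when it sits on a different VLAN/subnet, then save the change
    controller host <controller-ip-or-hostname>
    commit write memory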



  • 3.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 10:31
    Phil, once you're into the APs/RFS via PuTTY (CLI), please also share the output of the command 'show version' for both.


  • 4.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 15:09
    Hi Phil,

    As Andy expressed and Rob alluded to, please verify the AP adoption status and that the APs show as adopted and configured.



  • 5.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 17:35
    Additionally, you can check whether you see MINT neighbors.

    show mint neighbors

    MINT is used for communication between all WiNG devices. Are all devices using the same VLAN?


  • 6.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 18:08
    Since you had a power outage, I would start by ensuring the uplinks/VLANs/trunking are all still properly configured from the switches back to the RFS.
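
    A few read-only checks from the WiNG side can help here (a sketch; output and exact availability may vary slightly by release):

    ! On the RFS - confirm the uplink is up and carrying the expected VLANs
    show interface ge1
    show ip interface brief
    ! Confirm an affected AP is still reachable at Layer 3
    ping <ap-ip-address>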


  • 7.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-29-2017 18:17
    Good point Justin, it's also possible that there were uncommitted changes made to the configuration that were lost with the power outage.

    Phil, hopefully you have a backup of the configuration and can verify the settings are the same.
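
    As a reminder (a rough sketch of the usual WiNG workflow), changes only survive a power cycle once they have been committed and written to the startup config:

    ! Apply the candidate configuration and save it so it persists across reboots
    commit write memory
    ! Compare what is currently running against what would load after a reboot
    show running-config
    show startup-config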



  • 8.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 03:33
    Hi
    This is very strange. We have a mixture of AP7532s and AP7131s, split across two server rooms.
    The AP7532s have all come back, but none of the AP7131s.
    The APs connect to a Nortel 5520 in either server room, and the switches are part of a stack. The APs sit on VLAN 1 (flat network). The two RFS units show up in the GUI.
    I have restarted the AP7131s and the Nortel 5520s.

    The show mint neighbors output:
    rfs7000-Backup(config)*#sh mint neighbors
    5 mint neighbors of 70.38.0A.F9:
    4D.80.C3.AC (ap7532-MO-Nr-HR) at level 1, best adjacency vlan-1
    4D.80.C5.F4 (AP7532-ICT-B4a) at level 1, best adjacency vlan-1
    4D.80.C6.24 (ap7532-B4-Stores) at level 1, best adjacency vlan-1
    4D.82.BD.80 (ap7532-B4-CommsRoom) at level 1, best adjacency vlan-1
    70.81.BE.8E (rfs7000-Primary) at level 1
    Even the primary RFS is now showing as down in the GUI on the backup unit.

    The AP7532s seem fine; it's just the AP7131 units, although one unit is showing, and both RFS units are working.


  • 9.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 10:52
    Sounds like your cluster could be at issue. Make sure both members of the cluster are present and it is working as expected.

    From the cli:

    show cluster members
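
    A quick sketch of the cluster checks, run on either controller (exact output varies by release):

    ! List the cluster members and their roles
    show cluster members
    ! Overall cluster state, including which unit is currently active
    show cluster status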



  • 10.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 14:04
    Hi Andrew
    The cluster is there:
    rfs7000-Backup* 70.38.0A.F9 00-15-70-38-0A-F9 False standby 00:01:30 ago
    rfs7000-Primary 70.81.BE.8E 00-15-70-81-BE-8E True active self

    But I might have a bigger problem. I set the syslog running and this is what is coming in:

    Jun 30 16:02:46 172.17.146.105 2017-06-30T16:02:46.139223+01:00 rfs7000-Primary %DATAPLANE-4-DOSATTACK: IPSPOOF ATTACK: Source IP is Spoofed : Src IP : 10.0.0.138, Dst IP: 224.0.0.22, Src Mac: 58-98-35-9D-7A-44, Dst Mac: 01-00-5E-00-00-16, Proto = 2.
    Jun 30 16:02:52 172.17.146.105 2017-06-30T16:02:52.698984+01:00 rfs7000-Primary %DEVICE-4-OFFLINE: Device 00-15-70-EB-7D-00(ap7131-2) is offline, last seen:10 minutes ago on switchport -
    Jun 30 16:02:52 172.17.146.105 2017-06-30T16:02:52.704019+01:00 rfs7000-Primary %DEVICE-4-OFFLINE: Device 00-24-38-F3-72-00(ap7131-5) is offline, last seen:10 minutes ago on switchport -
    Jun 30 16:02:52 172.17.146.105 2017-06-30T16:02:52.710750+01:00 rfs7000-Primary %DEVICE-4-OFFLINE: Device 00-15-70-EB-96-CC(ap7131-4-PC02) is offline, last seen:10 minutes ago on switchport -
    Jun 30 16:03:02 172.17.146.105 2017-06-30T16:03:02.784410+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-11-25-8E-2E-5E, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-09-64-DE, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.152.4, ARP Target IP: 172.17.144.81 .
    Jun 30 16:03:05 172.17.146.105 2017-06-30T16:03:05.921494+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-11-25-8E-2E-5E, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-9A-2E-5E, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.148.19, ARP Target IP: 172.17.144.71 .
    Jun 30 16:03:07 172.17.146.105 2017-06-30T16:03:07.760199+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-14-5E-A4-7D-04, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-9D-4A-76, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.151.26, ARP Target IP: 172.17.144.71 .
    Jun 30 16:03:24 172.17.146.105 2017-06-30T16:03:24.183555+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-11-25-8E-2E-5E, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-09-64-DE, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.152.4, ARP Target IP: 172.17.144.59 .
    Jun 30 16:03:26 172.17.146.105 2017-06-30T16:03:26.885160+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-11-25-8E-2E-5E, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-9A-2E-5E, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.148.19, ARP Target IP: 172.17.144.71 .
    Jun 30 16:03:28 172.17.146.105 2017-06-30T16:03:28.723856+01:00 rfs7000-Primary %DATAPLANE-4-ARPPOISON: ARP CACHE POISONING: Conflicting ethernet header and inner arp header :Ethernet Src Mac: 00-14-5E-A4-7D-04, Ethernet Dst Mac: FF-FF-FF-FF-FF-FF, ARP Src Mac: 00-03-FF-9D-4A-76, ARP Dst Mac: 00-00-00-00-00-00, ARP Src IP: 172.17.151.26, ARP Target IP: 172.17.144.71 .

    I have enabled ARP trust on ge1 and set:

    no stateful-packet-inspection-l2

    Not sure if this is correct.

    This is where I'm clueless, or should that be more clueless than usual :-(



  • 11.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 14:32
    Looking at the RFS events, the APs seem to be available and then drop off.




  • 12.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 15:33
    Phil,

    Are there any other RFS devices connected anywhere on the network besides the two RFS7000s?

    As you said, the 7131s seem to be dropping off the network, so you'd need to track down why this is happening.

    I find that the show mint mlcp history command is the most useful for troubleshooting adoption issues, but it takes some time to figure out what all the messages indicate.

    You can try the following from the CLI of the Primary RFS:

    show mint mlcp history

    The command will give you an output of all the MINT-level handshaking going on between devices.

    Here's what it looks like on the RFS when an AP is adopted (time goes backwards):

    2017-06-14 09:37:10:Adopted 5C-0E-8B-34-E3-28 (0B.34.E3.28), cfgd notified
    2017-06-14 09:37:10:Sending MLCP Offer to 0B.34.E3.28 (link_level=1, preferred=0, capacity=144)
    2017-06-14 09:37:10:Sending MLCP Offer to 0B.34.E3.28 (link_level=1, preferred=0, capacity=144)
    2017-06-14 09:37:10:Sending MLCP Offer to 0B.34.E3.28 (link_level=1, preferred=0, capacity=144)
    2017-06-14 09:37:10:Sending MLCP Reply to (00.00.00.00,34,1,227.40.0.0:23566/5C-0E-8B-34-E3-28)
    2017-06-14 09:37:10:Sending MLCP Offer to 0B.34.E3.28 (link_level=1, preferred=0, capacity=144)
    To get the view from the 7131, you will also need to issue the same command on one of the 7131s that is not adopting. You should be able to SSH to it.

    And this is what it looks like from the AP's perspective:

    2017-06-14 09:37:10:Received OK from cfgd, adoption complete to 0B.1B.2A.E2
    2017-06-14 09:37:10:Waiting for cfgd OK, adopter should be 0B.1B.2A.E2
    2017-06-14 09:37:10:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-06-14 09:37:10:Adoption state change: 'No adopters found' to 'Connecting to adopter'
    2017-06-14 09:37:10:Try to adopt to 0B.1B.2A.E2 (cluster master 0B.1B.2A.E2 in adopters)




  • 13.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-02-2017 18:17
    Hi, this is the output from one of the AP7131s. It's very strange that the AP7532s seem unaffected. If any APs were set to controller-capable, would that affect anything?

    I will post the output from the RFS later, as they too have now dropped off the network.
    It goes from bad to worse 😞

    ap7131-7-PC01(config)#show mint mlcp history
    2017-07-02 20:39:37:Adoption state change: 'Waiting to retry' to 'No adopters found'
    2017-07-02 20:39:26:cfgd notified dpd2 of unadoption, restart adoption after 11 seconds
    2017-07-02 20:39:26:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 20:39:26:Adopter 70.38.0A.F9 is no longer reachable, cfgd notified
    2017-07-02 20:39:26:All adopters lost, restarting MLCP
    2017-07-02 20:39:26:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:52:15:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:52:15:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:52:15:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:52:13:Adoption state change: 'No adopters found' to 'Connecting to adopter'
    2017-07-02 19:52:13:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:51:14:Adoption state change: 'Waiting to retry' to 'No adopters found'
    2017-07-02 19:51:05:cfgd notified dpd2 of unadoption, restart adoption after 9 seconds
    2017-07-02 19:51:05:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:51:05:Adopter 70.38.0A.F9 is no longer reachable, cfgd notified
    2017-07-02 19:51:05:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:51:05:MLCP VLAN link already exists
    2017-07-02 19:51:05:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:51:05:All adopters lost, restarting MLCP
    2017-07-02 19:42:42:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:42:42:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:42:42:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:42:42:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:42:42:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:42:12:Adoption state change: 'Connecting to adopter' to 'Adoption failed': Connection error 145
    2017-07-02 19:41:47:Adoption state change: 'Waiting to retry' to 'Connecting to adopter'
    2017-07-02 19:41:47:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:41:40:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:41:40:MLCP VLAN link already exists
    2017-07-02 19:41:40:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:41:40:All adopters lost, restarting MLCP
    2017-07-02 19:41:38:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:41:38:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:41:38:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:41:38:cfgd notified dpd2 of unadoption, restart adoption after 9 seconds
    2017-07-02 19:41:38:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:40:50:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:40:50:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:40:50:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:40:50:Adoption state change: 'Waiting to retry' to 'Connecting to adopter'
    2017-07-02 19:40:50:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:40:38:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:40:38:MLCP VLAN link already exists
    2017-07-02 19:40:38:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:40:38:All adopters lost, restarting MLCP
    2017-07-02 19:40:38:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:40:38:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:40:38:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:40:38:cfgd notified dpd2 of unadoption, restart adoption after 12 seconds
    2017-07-02 19:40:38:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:39:43:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:39:42:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:39:42:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:39:39:Adoption state change: 'No adopters found' to 'Connecting to adopter'
    2017-07-02 19:39:39:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:39:34:Adoption state change: 'Waiting to retry' to 'No adopters found'
    2017-07-02 19:39:29:cfgd notified dpd2 of unadoption, restart adoption after 5 seconds
    2017-07-02 19:39:29:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:39:29:Adopter 70.38.0A.F9 is no longer reachable, cfgd notified
    2017-07-02 19:39:29:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:39:29:MLCP VLAN link already exists
    2017-07-02 19:39:29:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:39:29:All adopters lost, restarting MLCP
    2017-07-02 19:38:48:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:38:48:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:38:48:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:38:48:Adoption state change: 'Waiting to retry' to 'Connecting to adopter'
    2017-07-02 19:38:48:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:38:38:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:38:38:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:38:38:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:38:38:cfgd notified dpd2 of unadoption, restart adoption after 10 seconds
    2017-07-02 19:38:38:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:38:23:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:38:23:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:38:23:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:38:23:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:38:23:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:37:53:Adoption state change: 'Connecting to adopter' to 'Adoption failed': Connection error 145
    2017-07-02 19:37:27:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:37:27:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:37:24:Adoption state change: 'Waiting for Adoption OK' to 'Adoption failed': Cluster master is unknown
    2017-07-02 19:37:24:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:37:24:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:37:24:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:37:21:Adoption state change: 'Waiting for Adoption OK' to 'Adoption failed': Cluster master is unknown
    2017-07-02 19:37:21:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:37:21:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:37:21:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:37:18:Adoption state change: 'Waiting for Adoption OK' to 'Adoption failed': Cluster master is unknown
    2017-07-02 19:37:18:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:37:18:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:37:18:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:37:15:Adoption state change: 'Waiting for Adoption OK' to 'Adoption failed': Cluster master is unknown
    2017-07-02 19:37:15:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:37:12:Adoption state change: 'No adopters found' to 'Connecting to adopter'
    2017-07-02 19:37:12:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:37:11:Adoption state change: 'Adoption failed' to 'No adopters found'
    2017-07-02 19:37:00:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:37:00:MLCP VLAN link already exists
    2017-07-02 19:37:00:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:37:00:All adopters lost, restarting MLCP
    2017-07-02 19:36:41:Adoption state change: 'Connecting to adopter' to 'Adoption failed': Connection error 145
    2017-07-02 19:36:16:Adoption state change: 'Waiting to retry' to 'Connecting to adopter'
    2017-07-02 19:36:16:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:36:10:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:36:10:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:36:10:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:36:10:cfgd notified dpd2 of unadoption, restart adoption after 6 seconds
    2017-07-02 19:36:10:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:34:11:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:34:10:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:34:10:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:34:10:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:34:10:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    2017-07-02 19:33:40:Adoption state change: 'Connecting to adopter' to 'Adoption failed': Connection error 145
    2017-07-02 19:33:15:Adoption state change: 'Waiting to retry' to 'Connecting to adopter'
    2017-07-02 19:33:15:Try to adopt to 70.38.0A.F9 (cluster master 70.38.0A.F9 in adopters)
    2017-07-02 19:33:08:MLCP created VLAN link on VLAN 1, offer from 00-15-70-38-0A-F9
    2017-07-02 19:33:08:Sending MLCP Request to 00-15-70-38-0A-F9 vlan 1
    2017-07-02 19:33:08:MLCP link vlan-1 offerer 70.38.0A.F9 lost, restarting discovery
    2017-07-02 19:33:08:cfgd notified dpd2 of unadoption, restart adoption after 7 seconds
    2017-07-02 19:33:08:Adoption state change: 'Adopted' to 'Waiting to retry'
    2017-07-02 19:29:51:Received OK from cfgd, adoption complete to 70.38.0A.F9
    2017-07-02 19:29:51:Waiting for cfgd OK, adopter should be 70.38.0A.F9
    2017-07-02 19:29:51:Adoption state change: 'Connecting to adopter' to 'Waiting for Adoption OK'
    2017-07-02 19:29:51:Adoption state change: 'Adoption failed' to 'Connecting to adopter'
    2017-07-02 19:29:51:Try to adopt to 70.38.0A.F9 (cluster master 00.00.00.00 in adopters)
    ap7131-7-PC01(config)#



  • 14.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-02-2017 20:00
    Phil,

    Looking at your dump, it appears as if the config being pushed out to the APs once they are adopted is "breaking" the connectivity with the RFS. I can clearly see the AP is actually adopting, but then going into the 'Waiting to retry' state because it lost contact with the RFS.

    The Backup RFS is also doing the adoption, which seems to differ from your show cluster members output of the other day.

    Have a close look at the configuration going into the APs with the following command from the RFS:

    show run device <device_name>

    This will show you the final, complete config that will be sent to the device when it adopts, with the profiles folded down, so you can see exactly what the AP is going to get.

    Pay close attention to the last block of configuration, which will be the device itself, particularly the interface ge1 and vlan1 configuration, then compare it with the configuration going out to the 7532s using the same command. I suspect that something may have been modified in the 7131s' profile, hence the breakage you're experiencing.



  • 15.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-03-2017 06:33
    Hi, this is the output from the RFS (show run device), the last block.
    I'm not sure which bit I should be looking at.

    interface ge1
    speed 1000
    switchport mode trunk
    switchport trunk native vlan 1
    no switchport trunk native tagged
    switchport trunk allowed vlan 1-10
    no cdp receive
    no cdp transmit
    no lldp receive
    no lldp transmit
    interface ge2
    shutdown
    interface vlan1
    ip address dhcp
    ip address zeroconf secondary
    ip dhcp client request options all
    interface wwan1
    interface pppoe1
    use event-system-policy AP-Down
    use firewall-policy default
    ntp server 172.17.144.150 prefer version 3
    ntp server 172.17.144.151 version 3
    use role-policy RBFW
    email-notification host
    email-notification recipient
    logging on
    controller hello-interval 60 adjacency-hold-time 180
    service pm sys-restart
    no upgrade opcode auto
    no upgrade opcode path
    no upgrade opcode reload
    traffic-shape enable

    It looks OK


  • 16.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-03-2017 11:22
    Looking at the syslog, I also see this:
    %DATAPLANE-4-RAGUARD: RA-GUARD: router advertisement/redirect from/to untrusted port/wlan 0, vlan 1 : Src IP : fe80:0:0:0:217:c5ff:fe99:67a0, Dst IP: ff02:0:0:0:0:0:0:1, Src Mac: 00-17-C5-99-67-A0, Dst Mac: 33-33-00-00-00-01, ICMP type = 134, ICMP code = 0, Proto = 58.

    Here it's saying untrusted port/wlan 0; there is no wlan 0 or port 0, I believe.

    It looks like IPv6, which we do not use. Is it worth turning the IPv6 stuff off?



  • 17.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-03-2017 12:50
    Phil,

    Don't worry about IPv6.

    The config looks OK, except for the port speed setting. As a general rule, gigabit links should not use forced speed/duplex settings; in fact, the standard mandates auto-negotiation.
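
    As a sketch only (the profile name is a placeholder, and you should verify against your own config before committing), reverting the 7131s' ge1 to auto-negotiation might look like this:

    profile ap71xx <your-ap7131-profile>
     interface ge1
      ! let the AP and switch negotiate speed/duplex instead of forcing 1000
      speed auto
      duplex auto
    commit write memory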

    Your original post mentioned that all this started after a power failure, so I'm guessing some unsaved config changes got lost.

    Check the AP 7131 vs. AP 7532 config differences, as well as the switch-port config of the respective switches they are connected to. The MINT MLCP output clearly shows APs getting adopted and then dropping immediately afterward, indicating that something about the config is breaking the connectivity.

    Beyond this, I think some one-on-one troubleshooting and/or opening a case with GTAC is in order.



  • 18.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 01:45
    Can you log in to one of the APs and provide the output of the CLI command 'show adoption history'?


  • 19.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 03:39
    Hi, this is the output from show adoption history.

    The RFS units are up and can see each other. They have ge1 set as a trunk port with VLAN 1 as the native VLAN (currently not tagged), and the allowed VLANs are 1 and 10.
    The network switches have only two VLANs, 1 and 10, and the ports allow both (tagall).

    The APs have ge1 set as a trunk port with allowed VLANs 1 and 10 (native VLAN untagged), and then there is a WLAN-to-VLAN map in the wireless config for the two WiFi networks.

    The network switches are currently Nortel (hoping to move to Extreme in the very near future),
    but for now it's Nortel. This has all become a mystery as to what has gone wrong. Myself, I think it's the network switches, but proving it is the hard part. I suppose the other thing I could do is get a spare switch, default it,
    then connect the RFS and some APs to it and see what happens?

    -------------------------------------------------------------------------------- --------------------
    MAC TYPE EVENT TIME-STAMP REASON
    -------------------------------------------------------------------------------- --------------------
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:25:17 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:24:53 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:24:42 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:23:50 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:17:36 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:17:15 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:17:10 Received reset from switch 70.81.BE.8E, {'reason': 'controller cfgd is not your adopter due to misadoption'}
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:15:58 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:09:36 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:09:20 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:09:07 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:08:15 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:01:58 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:01:36 N.A.
    00-15-70-81-BE-8E RFS7000 un-adopted 2017-07-04 06:01:24 Adopter 70.81.BE.8E is no longer reachable
    00-15-70-81-BE-8E RFS7000 adopted 2017-07-04 06:00:33 N.A.
    --------------------------------------------------------------------------------



  • 20.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 11:59
    This gets stranger by the second. When I look at the RF Domain it indicates there are 8 devices online; when you select the pie chart it shows 8, but when you select the RF Domain from the tree view
    it shows 6, and two of those are the RFS7Ks.
    If I go to statistics and offline devices, it shows the units that are offline, with one AP connected to a device that I believe is a wired Polycom IP phone. I connected to the AP with a serial cable and logged in; it then started scrolling messages about IPSpoof,
    showing IPs that are in our range:
    "st Mac: 01-00-5E-00-00-FB, Proto = 17.
    Jul 04 13:41:17 2017: %DATAPLANE-4-DOSATTACK: IPSPOOF ATTACK: Source IP is Spoofed : Src IP : 172.17.152.31, Dst IP: 224.0.0.251, Src Mac: F4-F5-D8-AA-DB-66, Dst Mac: 01-00-5E-00-00-FB, Proto = 17.
    Jul 04 13:41:17 2017: %DATAPLANE-4-DOSATTACK: IPSPOOF ATTACK: Source IP is Spoofed : Src IP : 172.17.150.53, Dst IP: 224.0.0.251, Src Mac: 50-65-F3-46-48-62, Dst Mac: 01-00-5E-00-00-FB, Proto = 17.
    Jul 04 13:41:17 2017: %DATAPLANE-4-DOSATTACK: IPSPOOF ATTACK: Source IP is Spoofed : Src IP : 172.17.146.137, Dst IP: 224.0.0.252, Src Mac: 00-15-5D-90-CA-61,"

    The AP is now powered off.
    The syslog is still showing %DATAPLANE-4-DOSATTACK: IPSPOOF ATTACK: source IP is spoofed, 10.0.0.138, then the MAC, etc.

    I have a known working config from the RFS, taken on 26/5/17 when the WiFi bridge was set up and working, although the bridge is not in place at present.

    I'm not sure what to do now. Default the primary and backup, load the config back in, then set the cluster back up?



  • 21.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 12:16
    Phil,

    If you shut off the backup RFS, does the system start working properly again?

    If it does, then it's pretty simple to factory-default the backup unit and have it rejoin the cluster.
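
    If it comes to that, a rough outline of the default-and-rejoin sequence on the backup unit (treat this as a sketch, not exact syntax for your release, and double-check the join-cluster options before running it):

    ! On the backup RFS - wipe its saved configuration and reboot to defaults
    erase startup-config
    reload
    ! After it boots with a default config, pull it back into the existing cluster
    join-cluster <primary-rfs-ip> user <admin-user> password <admin-password>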



  • 22.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 13:17
    Hi Andrew

    I will shut the backup unit down and see what happens

    I have run show mint mlcp history again and it shows the following. Where it shows link_level=1, preferred=0, what does this relate to?

    2017-07-04 15:54:15:Sending MLCP Offer to 19.6B.76.C0 (link_level=1, preferred=0, capacity=1024)
    2017-07-04 15:53:34:Sending MLCP Offer to 4D.80.C5.F4 (link_level=1, preferred=0, capacity=1024)


  • 23.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-04-2017 13:29
    What you are seeing are offers to adopt the APs.

    link_level=1 refers to the MINT level; in this case, think of level 1 = Layer 2.

    preferred=0 means no specific preference; I'm not even sure that flag is used.

    capacity=1024 is the maximum (not licensed) capacity that the appliance will support.



  • 24.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-06-2017 06:56
    Still confused as to what has happened. Looking at the online devices, I have two that report they are connected to SEP:64167f829866: port1, which appears to be the MAC address of a Polycom VVX 201 VoIP phone. If I restart the IP phone and then refresh the online devices view, it shows the two APs as being connected to the primary RFS ge1. Then, as soon as the phone has booted, the APs show as connected to the phone again. None of the other online devices show as connected to anything?


  • 25.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-06-2017 11:04
    Phil,

    By what means are you "looking at online devices"? It appears as if some LLDP or CDP packets are perhaps clouding the issue.

    I think this issue is going beyond what can be troubleshot in this forum. I think you're going to need some hands-on assistance to get this resolved.

    If you have software support on your APs, please reach out to GTAC; they can assist. If you don't have support on your APs, software support is not as expensive as you might think, and in addition to calling GTAC, it also entitles you to the latest version.



  • 26.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-07-2017 08:30
    Hi Andrew
    I think I will take the RFS units offline, remove the startup-config from them, then load the last known config back in; this was taken on 26/5/17. As I know this config was working, it should rule out the RFS and the APs. The plan is to do this tomorrow morning.

    I have set up a new VLAN on the Nortel switches, VLAN 800. Now this is where I struggle a bit:
    if I set up VLAN 800, do I set the cluster VLAN to 800 on the RFS units and then, on the APs, set the controller VLAN to 800?

    Then on the AP, under Networks > ge1, set the native VLAN to 800 and then allow the other two VLANs, 1 and 10?
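
    In case it helps frame the question, here is a rough sketch of how that could be expressed in the CLI (VLAN numbers as in the plan above, everything else hedged; verify the exact commands for your release before applying anything):

    ! RFS side - carry VLAN 800 on ge1 and use it for cluster traffic
    interface ge1
     switchport mode trunk
     switchport trunk native vlan 800
     switchport trunk native tagged
     switchport trunk allowed vlan 1,10,800
    cluster member vlan 800

    ! AP profile - discover and adopt the controller over VLAN 800
    controller vlan 800
    interface ge1
     switchport mode trunk
     switchport trunk native vlan 800
     switchport trunk native tagged
     switchport trunk allowed vlan 1,10,800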



  • 27.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 07-12-2017 18:50
    It's all back up and working.
    I broke the cluster, cleared the startup-configs on the primary and backup units, then took each AP down and deleted its startup-config. I pushed the last good startup-config to the primary RFS and put the APs back online one at a time.

    VLAN 800 will be for the APs to communicate with the RFS.
    So under ge1 on the RFS, the mode is trunk,
    native VLAN 800,
    then tag the native VLAN,
    and the allowed VLANs are 1 and 10.

    On the Nortel switch, the port is a member of VLANs 1, 10 and 800, and set to tagall.

    Under cluster, do I set the cluster VLAN to 800?

    And on the AP, under adoption, do I set the controller VLAN to 800?

    Then under Ethernet ports, ge1: set the native VLAN to 800, tag it, and then allow VLANs 1 and 10?

    Then on the switch the APs connect to, have the port a member of VLANs 800, 1 and 10, and tagall?
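
    Once VLAN 800 is in place, a quick way to confirm the APs and RFS are actually talking over it would be something like the following (a sketch; output formats differ by release):

    ! On the RFS and on an AP - the MINT link should now show vlan-800
    show mint links
    show mint neighbors
    ! And the APs should report a stable adoption
    show adoption status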



  • 28.  RE: wing 5.8.5 reports AP's as down but they are not

    Posted 06-30-2017 15:33
    On this note... at one time another provider installed a controller that started adopting our APs because we had a part of their network accessible. That's a good angle to search on.