
10gig optic port goes down with remote fault once in a while and needs a power cycle to come back


I recently configured an Extreme switch, and its optical ports are not working well. The link state stays active for a while (a couple of hours to a couple of days), then changes to "ready" and never comes back until I power cycle the whole switch.

The recent log entries related to this issue are:
09/29/2015 22:05:50.48 [i] Port 45 link down - remote fault
09/29/2015 22:05:50.47 [i] Port 46 link down - remote fault
09/29/2015 07:02:32.41 [i] Port 45 link UP at speed 10 Gbps and full-duplex
09/29/2015 07:02:32.40 [i] Port 46 link UP at speed 10 Gbps and full-duplex
09/29/2015 07:02:27.37 [i] Port 45 link down - Local fault
09/29/2015 07:02:27.36 [i] Port 46 link down - Local fault
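
To see how long each port stayed up before faulting, the log lines above can be parsed with a short script. This is a minimal Python sketch that assumes the log format is exactly as pasted in this post; the names `ENTRY`, `parse_events`, and `last_state` are made up for illustration and are not part of EXOS.

```python
import re
from datetime import datetime

# Log lines copied from the post above.
LOG = """\
09/29/2015 22:05:50.48 [i] Port 45 link down - remote fault
09/29/2015 22:05:50.47 [i] Port 46 link down - remote fault
09/29/2015 07:02:32.41 [i] Port 45 link UP at speed 10 Gbps and full-duplex
09/29/2015 07:02:32.40 [i] Port 46 link UP at speed 10 Gbps and full-duplex
09/29/2015 07:02:27.37 [i] Port 45 link down - Local fault
09/29/2015 07:02:27.36 [i] Port 46 link down - Local fault
"""

# Hypothetical pattern: timestamp, port number, link state, optional fault kind.
ENTRY = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2}) \[i\] Port (\d+) link "
    r"(UP|down)(?: - (\w+) fault)?"
)


def parse_events(text):
    """Return link events sorted oldest first."""
    events = []
    for match in ENTRY.finditer(text):
        stamp, port, state, fault = match.groups()
        events.append({
            "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S.%f"),
            "port": int(port),
            "state": state.lower(),
            "fault": fault.lower() if fault else None,
        })
    return sorted(events, key=lambda e: e["time"])


def last_state(events):
    """Map each port to its most recent event."""
    latest = {}
    for event in events:
        latest[event["port"]] = event
    return latest


if __name__ == "__main__":
    for port, event in sorted(last_state(parse_events(LOG)).items()):
        print(port, event["time"], event["state"], event["fault"])
```

Because the pattern is applied with `finditer` over the whole text, it also copes with two entries that ended up on the same line.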

It seems the switch can recover from a local fault by itself, but when a remote fault occurs, the port does not come back. Does anybody know what a remote fault really means?

Switch Type: X670V-48t
EXOS version: 15.2.2.7

Detailed transceiver information on the Extreme switch side:
SFP or SFP+: SFP+
Signal: present
TX Fault: no
SFP/SFP+ Vendor: AVAGO
SFP/SFP+ Part Number: AFBR-709SMZ
SFP/SFP+ Serial Number: AD15163022E
SFP/SFP+ Manufacture Date: 150413
SFP/SFP+ Type: SFP
Connector: LC
Type: SR
Supported: no

Wavelength: 850
GBIC supports DDMI. MonitorType: 68

On the other end, we are connecting to EXFO QA-805 10 gig optic ports with transceivers supported by them.

Thanks!

17 replies

Userlevel 6
The remote fault means that the disconnect is coming from the other side. Can you check the "show port rxerror" on both those ports?
Patrick Voss wrote:

The remote fault means that the disconnect is coming from the other side. Can you check the "show port rxerror" on both those ports?

Hi Patrick, thanks for your quick response. The show command reports no errors, as in the following output.

X670V-48t.2 # show port 46 rxerror
Port Rx Error Monitor                                 Tue Sep 29 23:17:16 2015
Port      Link    Rx     Rx     Rx      Rx     Rx       Rx      Rx
          State   Crc    Over   Under   Frag   Jabber   Align   Lost
================================================================================
46        R       0      0      0       0      0        0       0

================================================================================
> indicates Port Display Name truncated past 8 characters
Link State: A-Active, R-Ready, NP-Port Not Present, L-Loopback
0->Clear Counters U->page up D->page down ESC->exit
Userlevel 6
Hi Mary,

You are correct. I don't see any errors. The error counters do reset after a switch reboot, and if the port never became active it wouldn't have received any errors. Can you try replacing the one having issues with a known working GBIC? Check both sides as well, as this could be related to the device on the other side.
Patrick Voss wrote:

Hi Mary,

You are correct. I don't see any errors. The error counters do reset after a switch reboot, and if the port never became active it wouldn't have received any errors. Can you try replacing the one having issues with a known working GBIC? Check both sides as well, as this could be related to the device on the other side.

Hi Patrick,

I heard that other teams in our company use the same GBIC with another Extreme switch without problems. I didn't reboot the switch, so it was still in the 'ready' state when I ran the show command. Is there some other command I can try to see the errors that occurred on those ports?

The device on the other side doesn't show any errors, and I don't need to reboot it to get the port back. You are right, I will also contact them to see if they know about this issue. But whatever the reason for the error on the Extreme switch port, I think the port should restart or reactivate itself instead of staying in the 'ready' state forever.

Thanks,
Mary
Userlevel 6
We need to get light readings if you can. I would guess you are also using type 3 MM patches? We have found that if there is a bad patch or a single high loss in one fiber, the link will also randomly go down, and you have to pull and reseat the SFP+ to get the link back up. We have also seen that some SFP+ modules just don't seat properly electrically and can be a bit flaky to keep stable.
EtherMAN wrote:

We need to get light readings if you can. I would guess you are also using type 3 MM patches? We have found that if there is a bad patch or a single high loss in one fiber, the link will also randomly go down, and you have to pull and reseat the SFP+ to get the link back up. We have also seen that some SFP+ modules just don't seat properly electrically and can be a bit flaky to keep stable.

Thanks for your reply. I'm using MM 50/125 (OM3) fiber patches and SFP+ modules as stated above. I think that might be the issue, if the following experiment is valid: I looped port 45 back to port 46 with the fiber patch, and the link went down after a while with "local fault" in the event log, with no way to bring it up again without a cold reboot.
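
For the light readings requested above, it helps to express transmit and receive power in dBm and subtract them to get the loss across the patch. The following is only a rough Python sketch: the function names and the sample readings are hypothetical, not values taken from this switch or its transceivers.

```python
import math


def mw_to_dbm(power_mw):
    """Convert optical power from milliwatts to dBm."""
    if power_mw <= 0:
        raise ValueError("power must be positive to express it in dBm")
    return 10 * math.log10(power_mw)


def link_loss_db(tx_dbm, rx_dbm):
    """Loss between the far-end transmitter and the local receiver, in dB."""
    return tx_dbm - rx_dbm


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    tx = mw_to_dbm(0.56)   # far-end Tx power in mW
    rx = mw_to_dbm(0.40)   # local Rx power in mW
    print(f"Tx {tx:.2f} dBm, Rx {rx:.2f} dBm, loss {link_loss_db(tx, rx):.2f} dB")
```

Running the comparison in both directions can show whether one fiber of the pair has much higher loss than the other.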
Userlevel 6
Hi!

You can also try increasing the debounce timer value on ports 45-46.

Do you use ports 47-48 as alternate stack ports? (When they are used that way, ports 45-46 should not be used as data ports. This applies to the X670, but it was an issue with the X670V too.)

Thank you!
Alexandr P wrote:

Hi!

You can also try increasing the debounce timer value on ports 45-46.

Do you use ports 47-48 as alternate stack ports? (When they are used that way, ports 45-46 should not be used as data ports. This applies to the X670, but it was an issue with the X670V too.)

Thank you!

Hi AlexandrP, I'm not very familiar with stack ports. Are they set by default? I'm not sure whether we are using stack ports or not.

I tried the following command to set the debounce timer:
X670V-48t.1 # configure port 45 debounce time 4000 ^
%% Invalid input detected at '^' marker.

But the command does not work on my console.

I ran show stacking-support and found the following:
Stack   Available Ports
Port    Native   Alternate   Configured   Current
-----   -----------------    ----------   ----------
1       No       47          Native       N/A
2       No       48          Native       N/A
stacking-support:            Disabled     N/A

Flags: * - Current stack port selection

Does that mean I'm using port 47 and 48 as alternate ports?
Userlevel 1
We have had this issue in the past with our 10G ports going down with a remote fault that will not clear until reboot. We ended up putting in an RMA for our devices and haven't had the issue since. I can tell you that putting in a different SFP of the same type on both sides seems to clear the remote fault. Step 1: remove the SFPs. Step 2: put a different SFP in on both sides. Step 3: reinsert the original SFP on both sides. Step 4: connect the fibers back. This has worked for me in the past as a workaround to get things back on the air for a short period of time.
christopher madison wrote:

We have had this issue in the past with our 10G ports going down with a remote fault that will not clear until reboot. We ended up putting in an RMA for our devices and haven't had the issue since. I can tell you that putting in a different SFP of the same type on both sides seems to clear the remote fault. Step 1: remove the SFPs. Step 2: put a different SFP in on both sides. Step 3: reinsert the original SFP on both sides. Step 4: connect the fibers back. This has worked for me in the past as a workaround to get things back on the air for a short period of time.

Hi Chris,

I just tried your method, but it does not seem to work. I only have four transceivers, so I swapped them, connected the fiber, and put them back.

What I noticed is that on the remote side, if I plug in the transceiver with no fiber connected, I can see a laser beam coming out of that transceiver port. However, once the 10 gig port on the switch goes down, no matter which optic transceiver I plug in locally, no laser comes out until a reboot.

Thanks for your suggestion. I'm contacting the Extreme support team now.

-Mary
Userlevel 1

Please never, ever look at the laser. If you look into any laser you can damage your eye. Please review your fiber safety documentation. You should never be able to see a laser with your bare eye.

Thank you! I will check it via a camera later.
Userlevel 6

Good call out Christopher. Thanks for keeping an eye on the big picture here...literally.

FYI, a phone camera works great as long as the camera does not have a built-in filter.

Thanks David!
