gPTP port roles

  • Question
  • Updated 12 months ago
  • Answered
I am using an X430-8p switch to connect two Altera Arria 10 kits running Linux PTP.  I am using L2 802.1AS (gPTP), and I'm wondering how the port role assignment should work in my case.  If I would like one of my A10 kits to be GM and the other to be an OC slave, what should the port roles be?  I am able to manipulate priorities to get both A10 nodes as slaves to the switch, and in that case the sync works very well (~20ns or less).  But when I have one as master and one as slave, they seem to sync, but very poorly and with large path delays.

In this case, where I would like one A10 kit to be master and the other to be slave (through the switch), should both switch ports report the Passive gPTP port role?  Is there a way to force ports into the passive role?

Thank you,
John

John Lemonovich


Posted 12 months ago


Chad Smith, Senior Escalation Support Engineer

John,

If you want to ignore the Best Master Clock Algorithm (BMCA) and manually specify master and slave ports, you need to issue the following commands:
configure network-clock gptp bmca off 
configure network-clock gptp slave-port PORT_TO_GMC
This will force the port connected to the GM to act as a slave port; all other ports will act as master ports.  You do not want ports in the passive mode in this scenario.

If you still want to use the BMCA, then either the device that you wish to be the GMC needs to advertise a lower priority1 value than the switch (the switch default is 246), or you need to raise the switch's priority1 value above the device's:
configure network-clock gptp default-set priority1 PRIORITY
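
To illustrate why the lower priority1 value wins, here is a minimal Python sketch of the attribute comparison at the heart of the BMCA. It is a simplification, not the switch's or LinuxPTP's implementation: it ignores steps-removed and port-identity tie-breaking, the class and function names are hypothetical, and every attribute other than priority1 is an assumed placeholder. The priority1 values are the switch default of 246 and the 128 and 255 that the two boards use in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for the attributes a clock advertises."""
    name: str
    priority1: int
    # Placeholder values below are assumptions, chosen only so that
    # priority1 decides the outcome in this example.
    clock_class: int = 248
    clock_accuracy: int = 0xFE
    offset_scaled_log_variance: int = 0xFFFF
    priority2: int = 248
    clock_identity: int = 0

    def rank(self):
        # Attributes are compared in this order; the numerically lower
        # tuple describes the better clock.
        return (
            self.priority1,
            self.clock_class,
            self.clock_accuracy,
            self.offset_scaled_log_variance,
            self.priority2,
            self.clock_identity,
        )


def best_master(candidates):
    """Return the candidate with the lowest rank tuple."""
    return min(candidates, key=lambda c: c.rank())


board_gm = Candidate("A10 board 1", priority1=128, clock_identity=0x1)
switch = Candidate("switch", priority1=246, clock_identity=0x2)
board_slave = Candidate("A10 board 2", priority1=255, clock_identity=0x3)

winner = best_master([board_gm, switch, board_slave])
print(winner.name)  # A10 board 1
```

With the first board removed from the list, the same comparison picks the switch, which matches the case where both boards end up as slaves to the switch.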

John Lemonovich

Chad,

Thanks for the reply.  I have played with turning off BMCA - my board connected to port 3 is set to priority1=128 and board 2 connected to port 5 is priority1=255.  When I issue the commands you listed, indeed port 3 (connected to GMC) shows 3s, but port 5 will "bobble" between 5d/5m.  It's as if my slave is timing out and beginning to run the BMCA itself (maybe)?  It's confusing because the slave is not printing that it's selecting a "new foreign master".  It is printing as if they are sync'd - but the offset times are large and varying quite a bit.

The network-clock counters for port 5 show:
Port number                                      : 5
gPTP port status                                 : Enabled
------------------------------------------------------------
Parameter                             Receive       Transmit
------------------------------------------------------------
Announce                                    0          14012
Sync                                        0         118634
Follow Up                                   0         118632
Peer Delay Request                      20581          31525
Peer Delay Response                     20592          20581
Peer Delay Response Follow Up           20590          20581
gPTP packet discards                    10925              -
------------------------------------------------------------
Announce Receipt Timeout Count                   : 0
Sync Receipt Timeout Count                       : 0
Peer Delay Allowed Lost Response Exceeded Count  : 10823


Any ideas?

Thanks!

John

Chad Smith, Senior Escalation Support Engineer

Based on the counters, it seems like there may be an intermittent delay or total loss of response messages from the Altera.  When the port is in Master state, what is the normal propagation delay measurement?

You can find that in:
show network-clock gptp port PORT
Could you try extending the threshold to see if it helps?
configure network-clock gptp ports PORT peer-delay neighbor-thresh THRESHOLD
Or maybe extend the allowed lost response?
configure network-clock gptp ports PORT peer-delay allowed-lost-responses COUNT
Ultimately, you probably need to figure out why the device isn't responding or why we think it isn't responding.
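
The counter arithmetic behind that reading can be sketched in a few lines of Python. The numbers are copied from the port 5 output posted above; the dictionary keys are made-up labels, and the sketch assumes the Receive column counts only responses the switch accepted, which the output itself does not confirm.

```python
# Port 5 counters as posted above (keys are hypothetical labels).
counters = {
    "pdelay_req_tx": 31525,       # Peer Delay Request, Transmit
    "pdelay_req_rx": 20581,       # Peer Delay Request, Receive
    "pdelay_resp_tx": 20581,      # Peer Delay Response, Transmit
    "pdelay_resp_rx": 20592,      # Peer Delay Response, Receive
    "pdelay_resp_fup_rx": 20590,  # Peer Delay Response Follow Up, Receive
    "discards": 10925,            # gPTP packet discards
    "lost_resp_exceeded": 10823,  # Allowed Lost Response Exceeded Count
}

# Requests the switch sent that have no counted response.
unanswered = counters["pdelay_req_tx"] - counters["pdelay_resp_rx"]
unanswered_fraction = unanswered / counters["pdelay_req_tx"]

# In the other direction, the switch answered every request it received.
switch_answered_all = counters["pdelay_req_rx"] == counters["pdelay_resp_tx"]

print(f"requests without a counted response: {unanswered} ({unanswered_fraction:.1%})")
print(f"switch answered all received requests: {switch_answered_all}")
```

Roughly a third of the switch's peer delay requests have no counted response, and that gap (10933) is close to both the discard count (10925) and the lost-response-exceeded count (10823).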

John Lemonovich

Thanks for the reply.  I have tried increasing the allowed-lost-responses, but it hasn't helped.  I don't know that I tried the neighbor threshold though.

It seems as if it's the switch that doesn't like the messages being returned by my slave.  In any case, I thought I would try a different test.  I have connected one of my boards as GMC to just one switch port, eliminating the other A10 board and the second switch port.  The switch correctly reports it as 802.1AS capable and labels the port as a slave, but my master is printing constant sync and announce timeout messages.  I'm not sure why, and I'm now installing Wireshark on the embedded Linux/FPGA side.
Would you expect the switch port in this case to simply work as an ordinary clock slave and remain sync'd with my master?  Is there any way to verify the time offsets or quality of the sync w/out a PPS output?

Chad Smith, Senior Escalation Support Engineer

John,

If you are the GMC, why would you be sending Announce/Sync timeout messages?  Shouldn't you be sending the Announce and Sync messages as the GMC?  If I'm understanding correctly, it sounds as if the device may not actually be acting as GMC.

Unlike PTP, with gPTP we do not perform any local clock synchronization.  We measure propagation delays and we are aware of the offset from the GMC, but the local clock is not synchronized with the GMC.  There isn't a way to measure sync quality on the switch other than to monitor the measured peer propagation delays and the Cumulative Rate Ratio ("show network-clock gptp parent-set" will give you this).
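
For reference, the peer propagation delay mentioned here is derived from the four timestamps of a peer delay exchange. The Python sketch below shows the usual form of that calculation as I understand it from the 802.1AS peer delay mechanism; the function names and the sample timestamps are made up for illustration, and this is not taken from the switch's implementation.

```python
def mean_link_delay_ns(t1, t2, t3, t4, rate_ratio=1.0):
    """One-way link delay from a single peer delay exchange, in ns.

    t1: Pdelay_Req sent       (initiator clock)
    t2: Pdelay_Req received   (responder clock)
    t3: Pdelay_Resp sent      (responder clock)
    t4: Pdelay_Resp received  (initiator clock)
    rate_ratio: responder frequency relative to the initiator's.
    """
    turnaround = rate_ratio * (t4 - t1)  # round trip seen by the initiator
    residence = t3 - t2                  # time spent inside the responder
    return (turnaround - residence) / 2.0


def estimate_neighbor_rate_ratio(t3_prev, t3_now, t4_prev, t4_now):
    """Rate ratio estimated from two successive exchanges."""
    return (t3_now - t3_prev) / (t4_now - t4_prev)


# Made-up timestamps: the two clocks are offset by 1 ms, the link delay
# is 300 ns each way, and the responder holds the request for 10 us.
delay = mean_link_delay_ns(t1=0, t2=1_000_300, t3=1_010_300, t4=10_600)
print(delay)  # 300.0
```

Note that the offset between the two clocks cancels out, which is why the delay can be measured without the clocks being synchronized. A measured delay that jumps around between exchanges points at timestamping or responsiveness problems on one side of the link.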

It may be more efficient at this point to open up a case with GTAC.

John Lemonovich

It's not sending timeout messages over Ethernet; it's printing to the console that it's not receiving sync/announce messages, presumably as information during the BMCA.  It may just be that my debug log level is so high that it's printing anything and everything.  I can open a support case.  I just didn't know whether any users on here had run into this scenario using Altera devices with LinuxPTP.
I am also now setting up a new GMC on a Linux PC so that I can just make my two A10 devices slaves, to see if that works well.  When I let the switch be master, both of my A10 devices sync as slaves and the performance is great ... sync'd to ~20ns or so on the PPS output.

Chad Smith, Senior Escalation Support Engineer

With BMCA off I don't think we will send Sync or Announce messages.  It really sounds as if the device isn't actually trying to act as GMC.  That could explain the behavior with the other slave as well.

Is it possible to directly connect the two devices (omitting the switch)?  If so, does everything work as expected?

John Lemonovich

Yes, with the two devices connected together (that was the starting point) they work perfectly, one as GMC and one as slave.  I have since turned BMCA back ON, which explains some of the messages/timeouts I was seeing.  Yesterday I tried a test turning on one port at a time.  I started with port 3 on, with one of my devices as GMC.  The switch seemed to work fine as slave to my GMC.  I then turned on port 5 and booted my other device, and it reported as slave to switch port 5 as master.  Then out of nowhere my two boards sync'd (one as master, one as slave) as I have been trying to achieve, and the sync and path delays were excellent.  That worked for about 10 minutes, and then something happened where the switch console was running painfully slow and I had to reboot.  That's where I left off...

Chad Smith, Senior Escalation Support Engineer

It's difficult to say what may have happened, but it was obviously something abnormal.  We probably need to take a look at this via a GTAC case.