Another 7i question. Three different devices, on separate days and in separate locations, failed with the same issue in the logs.
Firmware is the same on all three:
Image : Extremeware Version 7.8.4.1 [non-ssh] [base] by Build_Master on 03/18/11 05:48:45
BootROM : 8.2
We lost comms to the site and found port 32 (our uplink port to the WAN) showing a good link but passing no tx/rx traffic. Disabling and re-enabling the port made no difference. Even after swapping in KNOWN good
GBICs, cables, etc., port 32 was doing no rx or tx. The WAN/LAN link is point to point, and even though both sides said the link was up, we were unable to ping the /30 host IPs.
A cold reboot resolved it.
The logs showed this (first sign of trouble in the log was at 09:43:11.72):
03/17/2017 09:43:12.12 00x80308950 verifyArp2FdbLinkage+1a4: dumpArpEntry ( 85775420 , ead9 , 1 , 1d )
03/17/2017 09:43:12.05 00x8030a9a8 fillInIpAddressSecurityTrapInfo+21c: verifyArp2FdbLinkage ( 10 , 8030a904 , 0 , 80e262a4 )
03/17/2017 09:43:11.99 00x80305a2c extremeArpFlush+f80: fillInIpAddressSecurityTrapInfo ( 80d3e3ac , 0 , 840576c0 , 34008d01 )
03/17/2017 09:43:11.94 00x80305b78 arptimer +110: 80305894 ( 18 , 80e262a4 , 80d00d24 , 80e262a4 )
03/17/2017 09:43:11.86 00x803b1c70 netTask +210: arptimer ( 81147c3c , ffffffff , eeeeeeee , eeeeeeee )
03/17/2017 09:43:11.72 Access Address : 0x803059db
03/17/2017 09:43:11.72 Cause Register: 0x00001014
03/17/2017 09:43:11.72 Status Register: 0x34008d01
03/17/2017 09:43:11.72 Exception Program Counter: 0x803062e4
03/17/2017 09:43:11.72 Address store Exception
03/17/2017 09:43:11.72 Task tNetTask(854af240) failed
03/17/2017 09:43:11.72 Chassis temperature is 25 degrees Centigrade
03/17/2017 09:43:11.72 Func arpFreeHandler Line 5470:
03/17/2017 09:43:11.72 Mismatch ARP ptr(0x2cd24<>0x855b6340) at FDB
FYI, we do see some "Route IPFDB Handle !=" entries in the logs on a normal day.
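For anyone reading the register dump above, here is a minimal sketch of decoding the Cause Register and checking the alignment of the Access Address. It assumes the registers follow the MIPS32 CP0 layout (exception code in bits 6..2 of Cause), which the register names suggest but which I have not confirmed for this platform. The function names are made up for illustration.

```python
# Assumption: MIPS32-style Cause register, ExcCode in bits 6..2.
# This is not confirmed for the switch CPU; treat the output as a hint only.
EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modification)",
    2: "TLBL (TLB miss on load/fetch)",
    3: "TLBS (TLB miss on store)",
    4: "AdEL (address error on load/fetch)",
    5: "AdES (address error on store)",
    6: "IBE (bus error on instruction fetch)",
    7: "DBE (bus error on data access)",
}


def exc_code(cause):
    """Extract the 5-bit exception code from a Cause register value."""
    return (cause >> 2) & 0x1F


def describe(cause, access_addr):
    """Summarize the exception code and the alignment of the faulting address."""
    code = exc_code(cause)
    return {
        "exc_code": code,
        "name": EXC_CODES.get(code, "other"),
        "halfword_aligned": access_addr % 2 == 0,
        "word_aligned": access_addr % 4 == 0,
    }


# Values copied from the two crash dumps in this post.
incidents = {
    "03/17": (0x00001014, 0x803059DB),
    "03/20": (0x00001014, 0x8030605B),
}

for day, (cause, addr) in incidents.items():
    print(day, describe(cause, addr))
```

Under that assumption the code decodes to an address error on store, which lines up with the "Address store Exception" text, and the Access Address in both dumps is odd.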
The next day, at a different site with close to cookie-cutter configs, we had the same symptoms: we lost contact, and onsite we found the uplink port was not passing traffic. No matter what was used, port 32 was not registering any tx or rx traffic. Relevant logs:
1367209037 2017-03-20 12:39:58 PDT 10.96.208.6 EPICenter: SNMP Unreachable Device becomes unreachable. 1
1367199134 2017-03-20 11:41:11 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x80393540 macFdbRefreshArp+90 : verifyArp2FdbLinkage ( 69bb , 0 , 842bfe60 , c8 ) 1
1367199133 2017-03-20 11:41:11 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x80300b34 arpLearnSourceBinding+640: macFdbRefreshArp ( 23246c , 327a0001 , 84261c60 , 8545fad8 ) 1
1367199132 2017-03-20 11:41:11 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x8030198c processArpResponse+7bc: arpLearnSourceBinding ( a62d58e , 8545fbe0 , 80d3e438 , e0e0 ) 1
1367199131 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x80303bac arpinput +b34: processArpResponse ( 0 , 6e , 86f00538 , 80d4d2c0 ) 1
1367199130 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x800c3ef8 iparpFilter +2c8: arpinput ( 80728874 , fe2d00a1 , 7d5c3f , 30730201 ) 1
1367199129 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x800bc834 bridgeToCpu +2904: iparpFilter ( 80d38e3c , 86e6b5c0 , 85f4a2b8 , 86e44420 ) 1
1367199127 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x800bfde4 BGTask1_G2 +1320: bridgeToCpu ( eeeeeeee , eeeeeeee , eeeeeeee , e00806 ) 1
1367199126 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: 00x80d024ac vxTaskEntry +c : BGTask1_G2 ( 0 , 0 , 0 , 0 ) 1
1367199124 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Critical, Local7 Mar 20 11:40:10 KERN: inFlags=1b0 eType=806 offset=12 gMbuf=842bfe00 1
1367199123 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Critical, Local7 Mar 20 11:40:10 KERN: Line=4658 card=0 pt=0 chsub=e0 len=3c 1
1367199122 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Critical, Local7 Mar 20 11:40:10 KERN: Crash Dump Information for tBGTask 1
1367199121 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Task: 0x8545fe90 "tBGTask" 1
1367199120 2017-03-20 11:41:10 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Access Address : 0x8030605b 1
1367199119 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Cause Register: 0x00001014 1
1367199117 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Status Register: 0x34008d00 1
1367199116 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Exception Program Counter: 0x803062e4 1
1367199114 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Warning, Local7 Mar 20 11:40:10 SYST: Address store Exception 1
1367199113 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Critical, Local7 Mar 20 11:40:10 SYST: Task tBGTask(8545fe90) failed 1
1367199112 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Critical, Local7 Mar 20 11:40:10 SYST: Chassis temperature is 27 degrees Centigrade 1
1367199111 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Error, Local7 Mar 20 11:40:10 SYST: Func removeFromFdblist Line 1136: 1
1367199110 2017-03-20 11:41:09 PDT 10.96.208.6 Syslog: Error, Local7 Mar 20 11:40:10 SYST: Mismatch ARP ptr(0xc6c4<>0x84370ce0) at FDB 1
For the third instance I do not have the logs handy, but it had the same symptoms and the same notes in the logs.
What I find funny is that in all 3 instances the first sign of trouble in the logs was a "Mismatch ARP ptr" message (the pointer values differ per incident), for example:
Mismatch ARP ptr(0xc6c4<>0x84370ce0) at FDB
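To compare incidents side by side, something like the following rough sketch could pull the key fields out of each log. The sample lines are copied from the two dumps in this post; the `summarize` helper and the regexes are my own and only cover the message formats shown here.

```python
import re

# Patterns for the three fields that appear in both crash dumps.
MISMATCH = re.compile(r"Mismatch ARP ptr\((0x[0-9a-fA-F]+)<>(0x[0-9a-fA-F]+)\) at FDB")
EPC = re.compile(r"Exception Program Counter:\s*(0x[0-9a-fA-F]+)")
TASK = re.compile(r"Task (\w+)\([0-9a-fA-F]+\) failed")


def summarize(lines):
    """Return the first mismatch pointers, EPC and failed task found in the lines."""
    out = {"mismatch": None, "epc": None, "task": None}
    for line in lines:
        m = MISMATCH.search(line)
        if m and out["mismatch"] is None:
            out["mismatch"] = (m.group(1), m.group(2))
        m = EPC.search(line)
        if m and out["epc"] is None:
            out["epc"] = m.group(1)
        m = TASK.search(line)
        if m and out["task"] is None:
            out["task"] = m.group(1)
    return out


# Lines copied from the local log of the first incident.
SITE_A = [
    "03/17/2017 09:43:11.72 Exception Program Counter: 0x803062e4",
    "03/17/2017 09:43:11.72 Task tNetTask(854af240) failed",
    "03/17/2017 09:43:11.72 Mismatch ARP ptr(0x2cd24<>0x855b6340) at FDB",
]

# Message portions copied from the syslog export of the second incident.
SITE_B = [
    "Mar 20 11:40:10 SYST: Exception Program Counter: 0x803062e4",
    "Mar 20 11:40:10 SYST: Task tBGTask(8545fe90) failed",
    "Mar 20 11:40:10 SYST: Mismatch ARP ptr(0xc6c4<>0x84370ce0) at FDB",
]

for name, lines in (("site A", SITE_A), ("site B", SITE_B)):
    print(name, summarize(lines))
```

On the two logs I have, this shows a different failed task and different pointer values per incident, but the same Exception Program Counter (0x803062e4) in both.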
All three devices have passed extended diagnostics. Could over-saturation of some internal table be the cause of the three failures? Or some kind of unique traffic pattern?