Network disruption VSP8600

BRMS
New Contributor II

I’m running four VSP8600s in an SPBM configuration. Today we experienced a network disruption although no configuration changes had been made. The logs are full of entries like these:

************************************************************************************
Command Execution Time: Tue Oct 06 11:26:49 2020 CEST
************************************************************************************
1 2020-10-06T11:24:33.770+02:00 kreuz IO4 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:11 on VID 2422 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 8051
1 2020-10-06T11:24:30.533+02:00 kreuz IO2 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:36 on VID 426 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 16644
1 2020-10-06T11:22:35.621+02:00 kreuz IO3 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:2b on VID 419 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 13707

...

1 2020-10-06T10:42:31.049+02:00 kreuz IO5 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:28 on VID 205 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 9389
1 2020-10-06T10:42:31.035+02:00 kreuz IO6 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:28 on VID 205 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 631

...

1 2020-10-06T10:42:22.788+02:00 kreuz CP1 - 0x00004619 - 01900001 DYNAMIC CLEAR GlobalRouter SNMP INFO Smlt Link Up Trap(SmltId=133)
1 2020-10-06T10:42:22.788+02:00 kreuz CP1 - 0x0000000a - 01900001.133 DYNAMIC CLEAR GlobalRouter SW INFO SMLT 133 Link is UP
1 2020-10-06T10:42:17.262+02:00 kreuz CP1 - 0x0000461a - 01900001 DYNAMIC SET GlobalRouter SNMP INFO Smlt Link Down Trap(SmltId=133)
1 2020-10-06T10:42:17.261+02:00 kreuz CP1 - 0x00000009 - 01900001.133 DYNAMIC SET GlobalRouter SW INFO SMLT 133 Link is DOWN

...

1 2020-10-06T10:04:03.768+02:00 kreuz IO3 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:2b on VID 419 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 12529
1 2020-10-06T10:03:48.600+02:00 kreuz IO4 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:11 on VID 2422 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 7347

What could be the cause of this problem, and how can I debug it further? My log only goes back about one hour.
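One way to get more out of a short log window is to group the "peer mac learnt on non-IST" messages by IO slot, peer MAC, VID and MLT, and look at how far each move counter advanced. The following is only a sketch: the regular expression is written against the message format quoted above, and the names `PATTERN`, `SAMPLE` and `summarize_moves` are made up for this example. The four sample lines are copied from the excerpt above.

```python
import re

# Matches the "peer mac learnt on non-IST" message format quoted above.
# Assumption: fields are whitespace-separated as in the pasted excerpt.
PATTERN = re.compile(
    r"^\d+\s+(?P<ts>\S+)\s+(?P<host>\S+)\s+(?P<slot>\S+)\s+.*?"
    r"VIST peer mac (?P<mac>[0-9a-f:]{17}) on VID (?P<vid>\d+) "
    r"is learnt on non-IST MltId-(?P<mlt>\d+).*?"
    r"Total Peer Mac Move Count: (?P<count>\d+)"
)

# Four lines copied from the log excerpt (two per IO slot).
SAMPLE = [
    "1 2020-10-06T11:24:33.770+02:00 kreuz IO4 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:11 on VID 2422 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 8051",
    "1 2020-10-06T11:22:35.621+02:00 kreuz IO3 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:2b on VID 419 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 13707",
    "1 2020-10-06T10:04:03.768+02:00 kreuz IO3 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:2b on VID 419 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 12529",
    "1 2020-10-06T10:03:48.600+02:00 kreuz IO4 - 0x00138537 - 0004e001 DYNAMIC CLEAR GlobalRouter COP-SW INFO VIST peer mac b4:2d:56:9c:7a:11 on VID 2422 is learnt on non-IST MltId-1, Pointing record back to IST port.Total Peer Mac Move Count: 7347",
]


def summarize_moves(lines):
    """Return {(slot, mac, vid, mlt): (lowest_count, highest_count, n_messages)}."""
    stats = {}
    for line in lines:
        m = PATTERN.match(line.strip())
        if not m:
            continue  # ignore SMLT up/down traps and anything else
        key = (m["slot"], m["mac"], int(m["vid"]), int(m["mlt"]))
        count = int(m["count"])
        lo, hi, n = stats.get(key, (count, count, 0))
        stats[key] = (min(lo, count), max(hi, count), n + 1)
    return stats


for (slot, mac, vid, mlt), (lo, hi, n) in sorted(summarize_moves(SAMPLE).items()):
    print(f"{slot} {mac} VID {vid} MltId-{mlt}: {n} messages, counter {lo} -> {hi} (+{hi - lo})")
```

Fed with the complete log file instead of `SAMPLE`, the summary shows which VIDs and which MLT the peer MACs keep being learnt on, which narrows down where to look for a loop or a misbehaving link.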

1 ACCEPTED SOLUTION

Miguel-Angel_RO
Valued Contributor II

brms,

I see the following points to work on:

  • Change the IS-IS metrics of your IS-IS interfaces: the MLT should have the cost of interface 1/1 or 1/5 divided by 2.
    • If they are 10G links: MLT = 100, interfaces 1/1 and 1/5 = 200
  • I would enable SLPP on all the C-VLANs.
  • Could you confirm the value of the I-SID used for the vIST on the different switches?
    • It should be unique per cluster.
  • I would change the subnet to a /30, although using a /24 shouldn’t cause any issue.
  • Avoid redistributing the vIST subnets into IS-IS/OSPF/other protocols: https://gtacknowledge.extremenetworks.com/pkb_mobile#article/How_To/kA12T0000004QhGSAU/s
  • Ensure that you don’t use the vIST subnet for anything other than the vIST (not as a next hop, not for SNMP access, etc.).

Mig
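The metric rule in the first bullet is plain arithmetic. As a small sketch, the helper below divides the single-link metric by the number of MLT members. The post only states the two-link case (200 / 2 = 100); extending it to any member count is an assumption of this example, and `mlt_metric` is a made-up name, not a switch command.

```python
def mlt_metric(member_metric: int, member_count: int) -> int:
    """Metric for an MLT: the single-link metric divided by the member count.

    Assumption: the "divided by 2" rule from the post generalizes to
    "divided by the number of member links".
    """
    if member_count < 1:
        raise ValueError("an MLT needs at least one member link")
    # Never go below 1 so the result stays a usable positive metric.
    return max(1, member_metric // member_count)


SINGLE_10G_METRIC = 200  # value given in the post for interfaces 1/1 and 1/5

print("2 x 10G MLT metric:", mlt_metric(SINGLE_10G_METRIC, 2))
```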

BRMS
New Contributor II

Since I had to resolve the immediate problem, I just rebooted the switches with the CLI command “reset”. The one with the log errors now reports that a core dump has been saved. Does this indicate a hardware error?

BRMS
New Contributor II

In the past I used the same VLAN for both vISTs because I didn’t know better. After I noticed log entries about the VLAN, I changed it from 4053 to 4054 on the VSPs that are now causing problems. Can the subnet size be a problem? In the FDB table for these subnets I don’t see any MACs other than those of the VSPs.

Problematic VSPs, show interface vlan 4054: https://pastebin.com/ek4VSnuc

Working VSPs, show interface vlan 4053: https://pastebin.com/mNtsfmaX

How can I disable route redistribution for the vIST VLAN?

Miguel-Angel_RO
Valued Contributor II

brms,

I like neither the fact that the vIST subnet is distributed in the IS-IS routing table nor the fact that it is a /24.

I suppose that you use the same VLAN ID and I-SID on all four VSPs. Could you confirm?

Best practice is to use a /30 that is not redistributed into the routing table, with a different I-SID per cluster.

Mig
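To illustrate the /30 recommendation, here is a short sketch using Python's standard `ipaddress` module. The 172.28.73.x addresses are the vIST addresses of pik and kreuz as they appear in the route output further down the thread; the /30 itself is a hypothetical replacement, not something taken from the actual configuration.

```python
import ipaddress

# Addresses taken from the "show ip route" output quoted in this thread.
current = ipaddress.ip_network("172.28.73.0/24")
proposed = ipaddress.ip_network("172.28.73.0/30")  # hypothetical replacement
peers = [ipaddress.ip_address("172.28.73.1"), ipaddress.ip_address("172.28.73.2")]


def usable_hosts(net):
    """Usable host addresses (network and broadcast addresses excluded)."""
    return list(net.hosts())


print(len(usable_hosts(current)), "usable addresses in", current)
print(len(usable_hosts(proposed)), "usable addresses in", proposed)
print("both peers fit in the /30:", all(p in usable_hosts(proposed) for p in peers))
```

A /30 leaves room for exactly the two vIST peers, so nothing else can end up in that subnet by accident.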

BRMS
New Contributor II

VOSS on the 8600 doesn’t support include 🙂

Here is the requested output:

show ip route on the first VSP (pik). 172.28.72.1 and 172.28.72.2 are the vIST IPs of the two VSPs at the other location:

172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4051     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4051     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4052     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4052     ISIS 0   IBSE 7  
172.28.73.0     255.255.255.0   172.28.73.1          -                1      4054     LOC  0   DB   0 

show ip route on the second VSP (kreuz, the one with the log entries):

172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4051     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4051     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4052     ISIS 0   IBSE 7  
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4052     ISIS 0   IBSE 7  
172.28.73.0     255.255.255.0   172.28.73.2          -                1      4054     LOC  0   DB   0
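The two route tables above can be checked mechanically for vIST subnets that were learnt through IS-IS, which is what the redistribution advice is about. This is a sketch only: the column positions (destination, mask, next hop, VRF, cost, interface, protocol, ...) are assumed from the pasted lines, `VIST_SUBNETS` reflects the two subnets named in this thread, and `parse_routes` and `leaked_vist_routes` are names invented for the example.

```python
import ipaddress

# Route lines copied from the kreuz output above.
ROUTES = """
172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4051     ISIS 0   IBSE 7
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4051     ISIS 0   IBSE 7
172.28.72.0     255.255.255.0   karo                 GlobalRouter     10     4052     ISIS 0   IBSE 7
172.28.72.0     255.255.255.0   herz                 GlobalRouter     10     4052     ISIS 0   IBSE 7
172.28.73.0     255.255.255.0   172.28.73.2          -                1      4054     LOC  0   DB   0
"""

# vIST subnets of the two locations, as described in the thread.
VIST_SUBNETS = [
    ipaddress.ip_network("172.28.72.0/24"),
    ipaddress.ip_network("172.28.73.0/24"),
]


def parse_routes(text):
    """Parse pasted route lines; assumes ten whitespace-separated columns."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip blank lines and anything that is not a route row
        routes.append({
            "net": ipaddress.ip_network(f"{fields[0]}/{fields[1]}"),
            "next_hop": fields[2],
            "interface": fields[5],
            "proto": fields[6],
        })
    return routes


def leaked_vist_routes(routes):
    """Routes learnt via IS-IS that overlap one of the vIST subnets."""
    return [
        r for r in routes
        if r["proto"] == "ISIS" and any(r["net"].overlaps(v) for v in VIST_SUBNETS)
    ]


for r in leaked_vist_routes(parse_routes(ROUTES)):
    print(f"vIST subnet {r['net']} learnt via IS-IS from {r['next_hop']} on {r['interface']}")
```

Once redistribution of the vIST subnets is stopped, the same check should come back empty on every switch.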

 

show vlan i-sid on both vIST peers:

4054       104054

show isis spbm i-sid all on pik:

104054    0.3d.b3       4051   00db.face.0003       config       pik
104054    0.4d.b4       4052   00db.face.0004       discover   kreuz

show isis spbm i-sid all on kreuz:

104054    0.3d.b3       4051   00db.face.0003       discover   pik
104054    0.4d.b4       4052   00db.face.0004       config       kreuz

BTW, I forgot to mention: the problem only occurs at location 2; location 1 doesn’t show the same errors. The configuration should be identical, as far as that is possible.
