Analytics: Not all TCP flows show Network or Application Response time?


Userlevel 6
The Purview / Analytics installation works fine, except that some TCP flows from some servers do not show Network or Application Response time. Because the NetFlow and policy mirrorN sources are S4 core routers at two different locations, we use a GRE tunnel setup.

So why are not all response times shown? Maybe the policy mirrorN is not complete?
Any other reasons?

If I have a LAG to some servers, do I have to set it on the LAG logical port too? (Is that necessary?)
I will double-check that the policy mirrorN is set everywhere!

But how can I troubleshoot/debug or confirm this (that the incoming policy mirror actually covers a specific flow)? Any suggestions?

Regards

8 replies

Userlevel 6
Hello Mattias,
It is possible that the mirrorN did not deliver the correct information for the flow to determine its round-trip time. Taking a tcpdump at the Purview appliance and filtering for the source and destination on the gre1 interface is the shortest path to start figuring that out.
It's important to note that if you want to see round-trip time, you need both NetFlow and mirrorN data traffic from the same port set. If a LAG is the source port, then the LAG should be specified in NetFlow as rx and in the policy section of the configuration as well.
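
For example, a capture along these lines on the appliance (the two addresses are placeholders for the client and server of one affected flow) shows whether both directions of the mirrored conversation actually arrive over the tunnel:

  # capture the affected flow as it arrives on the GRE interface of the appliance
  tcpdump -ni gre1 -w /tmp/flow.pcap 'tcp and host 10.1.1.10 and host 10.2.2.20'

If only one direction (or nothing at all) shows up in the capture, the mirrorN configuration or the tunnel is the place to look.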
Userlevel 6
Hi Mike,

That sounds perfect for troubleshooting and debugging!

Regards
Userlevel 6
One thing I suspect is that the GRE tunnel (between the S8 and the PV engine) may be overloaded.

I checked the port counters of the native port tg.4.20 (which connects the S8 and the Purview server) and the dummy port for the GRE setup, ge.2.48 - no packets are discarded and no errors are counted.
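
For reference, the counter check was simply the EOS port counter display for both ports involved (the exact output columns may differ per firmware):

  show port counters ge.2.48
  show port counters tg.4.20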

Can it be that the dummy port ge.2.48 (which is the GRE source port) limits the GRE tunnel bandwidth to 1 Gb/s? Is there any other limitation, or does the GRE tunnel transport the full 10 Gb/s of traffic if needed?

Regards
Userlevel 6
It's not likely discarding. It is more likely that internally, if switch packet processing spikes above 60 percent, a flow does not get created or mirrored. It's also possible in some instances to see, on the other side, a Purview appliance with high CPU (as measured via top), or, if it is a virtual machine, overall VM hardware with overloaded CPUs, and then problems can occur that way as well. So if VMs are used, it is a good idea to make adequate resources available.

However, the most common explanation of your problem is that the NetFlow data appears, but the corresponding mirrorN traffic does not arrive, at least not for both directions.
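
As a quick sanity check of the appliance side mentioned above, the CPU situation on the Purview engine (or the VM) can be snapshotted with standard Linux tools, for example:

  # one-shot snapshot of load average and the busiest processes
  top -b -n 1 | head -n 15
  # number of CPUs available to the engine
  nproc

If the load average stays well above the CPU count while traffic peaks, the appliance or the VM sizing is a likely suspect.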
Userlevel 6
Mike Thomas wrote:

It's not likely discarding. It is more likely that internally, if switch packet processing spikes above 60 percent, a flow does not get created or mirrored. It's also possible in some instances to see, on the other side, a Purview appliance with high CPU (as measured via top), or, if it is a virtual machine, overall VM hardware with overloaded CPUs, and then problems can occur that way as well. So if VMs are used, it is a good idea to make adequate resources available.

However, the most common explanation of your problem is that the NetFlow data appears, but the corresponding mirrorN traffic does not arrive, at least not for both directions.

Hi Mike,
thanks for your explanation!

But I want to ask you again - is the configured speed of the GRE source port (1 Gb/s) a bandwidth limiter for the GRE link? Or, to ask it another way - will the GRE tunnel be faster / have more capacity if I use a tg.x.y port as the source port?

Thanks
Userlevel 6
Mike Thomas wrote:

It's not likely discarding. It is more likely that internally, if switch packet processing spikes above 60 percent, a flow does not get created or mirrored. It's also possible in some instances to see, on the other side, a Purview appliance with high CPU (as measured via top), or, if it is a virtual machine, overall VM hardware with overloaded CPUs, and then problems can occur that way as well. So if VMs are used, it is a good idea to make adequate resources available.

However, the most common explanation of your problem is that the NetFlow data appears, but the corresponding mirrorN traffic does not arrive, at least not for both directions.

I have confirmed with development that the GRE Source port will affect output performance across the egress GRE tunnel. We have yet to see this as a limitation with most mirrorN setups, but it is possible. So a 10G port or a TBP on a PV-FC-180 may be recommended in these cases.
Userlevel 6
Mike Thomas wrote:

It's not likely discarding. It is more likely that internally, if switch packet processing spikes above 60 percent, a flow does not get created or mirrored. It's also possible in some instances to see, on the other side, a Purview appliance with high CPU (as measured via top), or, if it is a virtual machine, overall VM hardware with overloaded CPUs, and then problems can occur that way as well. So if VMs are used, it is a good idea to make adequate resources available.

However, the most common explanation of your problem is that the NetFlow data appears, but the corresponding mirrorN traffic does not arrive, at least not for both directions.

Hi Mike,
A TBP port is my favorite - but we have an S8 (with S180 series modules) and no PV-FC-180.
Do you think it will work with a TBP on the S8, or do I have to sacrifice a real tg.x.y port?
Userlevel 6
Mike Thomas wrote:

It's not likely discarding. It is more likely that internally, if switch packet processing spikes above 60 percent, a flow does not get created or mirrored. It's also possible in some instances to see, on the other side, a Purview appliance with high CPU (as measured via top), or, if it is a virtual machine, overall VM hardware with overloaded CPUs, and then problems can occur that way as well. So if VMs are used, it is a good idea to make adequate resources available.

However, the most common explanation of your problem is that the NetFlow data appears, but the corresponding mirrorN traffic does not arrive, at least not for both directions.

There is no TBP possible on the S8, only on the PV-FC-180, as that internally uses otherwise unused hardware. So you may need to sacrifice a 10G port if bandwidth is an issue.
