Application telemetry best practices, telemetry.pol enhancements and some flow hints

  • Question
  • Updated 2 months ago
  • Answered
Hi all

I'm trying to understand how to better use the analytics stuff ...

I have several questions,
and maybe some ideas ...

first of all: APP Telemetry
I see that everything seems to work thanks to:
- sflow
- a quite big policy on the switch
- the remote EAN mirror

with a customer, I would like to have as much information as possible about a specific kind of traffic ...
what I have done so far is:
- I set the sflow sampling rate to the minimum value of 256 (1 packet in 256) on ports where I know there are the "interesting devices"
- I modified the telemetry.pol to mirror to the "EAN mirror" ALL the traffic "from and to" those devices

am I moving in the right direction,
or am I doing it in a completely wrong way?
if I mirror a lot of traffic to the "EAN mirror", will this increase the amount of information that the analytics software can get from the traffic, or is it completely useless (for example, because the analytics software can handle only specific traffic signatures, and NOTHING outside of those ...)?

second: network response time vs application response time
I see I don't have those values for ALL the flows I have ... why?
I mean ... I suppose part of the "problem"  is related to sflow sampling rate ...
but what else?
limited signatures?
can I have those values for both UDP and TCP traffic?


third: core2 probe
I know that using a core2 probe, with all mirrored traffic,
I can have a lot of information about network traffic ...
but ...
is there any way to have a similar amount of information,
without the use of a core2 probe?
(I mean, also considering what I said in the previous "points")
What I don't understand is "WHY I always need a Core2 physical probe" ...
I can understand it for a "huge amount" of traffic,
but IF I need more info about a very specific flow, or very specific traffic,
why can't I have "software" directly on the Analytics VM that can analyze that traffic ...
moreover now I have the GRE tunnel option, which is "super useful"!! ...

please let me know what you think

best regards

Stefano




 

Stefano Dall'Osto


Posted 2 months ago


Mike Thomas, Employee - GTAC - NMS

 

Hi all

I'm trying to understand how to better use the analytics stuff ...

I have several questions,
and maybe some ideas ...

first of all: APP Telemetry
I see that everything seems to work thanks to:
- sflow
- a quite big policy on the switch
- the remote EAN mirror

 

<<Extreme>>Application Telemetry is a combination of two “traffic” feeds. The first is a collection of hardware based ACLs designed by the ExtremeAnalytics research team to isolate specific packets necessary for application detection, network response time, and application response time. These ACL’d packets are then redirected into a remote mirror and fed via the ERSPAN protocol to the DPI engine on the ExtremeAnalytics appliance. Sflow is the second of these two traffic feeds and provides sampled flow data for the switch that allows extrapolated bandwidth and top talkers. Sflow is also forwarded to the ExtremeAnalytics engine where the flow data is combined with the results of the DPI processing on the raw packets within the ERSPAN. Both the ACLs and the Sflow are configured for a given switch across all ports but only on ingress.

with a customer, I would like to have as much information as possible about a specific kind of traffic ...
what I have done so far is:
- I set the sflow sampling rate to the minimum value of 256 (1 packet in 256) on ports where I know there are the "interesting devices"

<<Extreme>>Application Telemetry is added from the Configuration tab of the Analytics screen within Management Center. Yes, the Management Center interface allows configuring a particular switch as low as 1 out of 256 packets for the sampling rate. Our guidance is to limit that configuration option to less busy switches and use a value of 1024 for highly utilized switches.
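
As a rough illustration of why the sampling rate matters, here is a minimal sketch of how sampled sflow data can be scaled up to an estimate. This is not how ExtremeAnalytics computes its values internally; the function names and the sample values are hypothetical, and it only shows the general idea that each 1-in-N sample stands in for roughly N packets.

```python
def extrapolate_bytes(sampled_frame_lengths, sampling_rate):
    """Scale the bytes seen in 1-in-N samples up to an estimated total."""
    return sum(sampled_frame_lengths) * sampling_rate


def estimate_bandwidth_bps(sampled_frame_lengths, sampling_rate, interval_seconds):
    """Estimated bits per second over the collection interval."""
    total_bytes = extrapolate_bytes(sampled_frame_lengths, sampling_rate)
    return total_bytes * 8 / interval_seconds


def expected_samples(packets_in_flow, sampling_rate):
    """Average number of samples a flow of this size contributes."""
    return packets_in_flow / sampling_rate


# Hypothetical frame lengths (bytes) sampled on one port during 10 seconds.
samples = [1500, 64, 512, 1500]
print(estimate_bandwidth_bps(samples, 256, 10.0))

# A 10,000-packet flow yields far fewer samples at 1 in 1024 than at 1 in 256,
# so short flows are more likely to be missed entirely at the higher value.
print(expected_samples(10_000, 256))
print(expected_samples(10_000, 1024))
```

The trade-off is that a value of 256 gives more samples per flow at the cost of more load on a busy switch, which fits the guidance above to keep 256 for less busy switches.
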
- I modified the telemetry.pol to mirror to the "EAN mirror" ALL the traffic "from and to" those devices
<<Extreme>>You should avoid any alterations to the “telemetry.pol” file as this file has been created by the ExtremeAnalytics research team and takes into account ACL hardware resources on the switches as well as what packets are required for our underlying fingerprints to properly detect application names and determine response times.

am I moving in the right direction,
or am I doing it in a completely wrong way?
if I mirror a lot of traffic to the "EAN mirror", will this increase the amount of information that the analytics software can get from the traffic, or is it completely useless (for example, because the analytics software can handle only specific traffic signatures, and NOTHING outside of those ...)?

<<Extreme>>All you need to do is use the Management Center interface to configure Application Telemetry on the EXOS G2 switches. It sounds like you are trying to configure these manually and also making custom alterations to the ACL list. Manual configuration and changes to the ACLs should be avoided.

second: network response time vs application response time
I see I don't have those values for ALL the flows I have ... why?

<<Extreme>>Application Telemetry calculates network response time for all TCP flows utilizing TCP SYN and TCP SYN ACK packets. Additionally, application response times are largely calculated for HTTP/HTTPS flows where the application layer client request and server response are isolated by the ACLs. Furthermore, application response times are calculated for critical UDP based network protocols such as DNS and DHCP as these protocols contain both a client request and server response from which application response time can be calculated.
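
To make the SYN / SYN ACK idea concrete, here is a minimal sketch of deriving a network response time from the handshake timestamps. It is only an illustration under assumptions: the `TcpPacket` type, the field names and the addresses are hypothetical, and the real DPI engine may measure this differently. Note that a flow whose handshake was never captured produces no value at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpPacket:
    timestamp: float  # capture time in seconds
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool
    ack: bool


def network_response_times(packets):
    """Map (client, cport, server, sport) to SYN -> SYN ACK delay in seconds."""
    pending = {}  # flow key -> timestamp of the first SYN seen
    results = {}
    for p in sorted(packets, key=lambda pkt: pkt.timestamp):
        if p.syn and not p.ack:
            # Client SYN: remember the first one, ignore retransmissions.
            pending.setdefault((p.src, p.sport, p.dst, p.dport), p.timestamp)
        elif p.syn and p.ack:
            # Server SYN ACK: the key is the reverse direction of this packet.
            key = (p.dst, p.dport, p.src, p.sport)
            if key in pending:
                results[key] = p.timestamp - pending.pop(key)
    return results


packets = [
    TcpPacket(1.000, "192.0.2.10", 50000, "192.0.2.20", 80, True, False),
    TcpPacket(1.012, "192.0.2.20", 80, "192.0.2.10", 50000, True, True),
    # Mid-stream packet of an already established SSH session: no handshake seen.
    TcpPacket(5.000, "192.0.2.10", 50001, "192.0.2.20", 22, False, True),
]
print(network_response_times(packets))
```
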

I mean ... I suppose part of the "problem" is related to sflow sampling rate ...
but what else?
limited signatures?
can I have those values for both UDP and TCP traffic?

<<Extreme>>At present Application Telemetry supports the calculation of the response times detailed in the previous response. There are numerous underlying reasons for this behavior; however, the most critical is that many UDP flows are unidirectional and there is no way to calculate either network or application response time. Think UDP based SYSLOG as an example of this type of traffic.
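
The request/response point can be sketched as follows. This is a hypothetical illustration, not the ExtremeAnalytics implementation: the `UdpMessage` type and its fields are assumptions, and matching on a transaction ID is simply how protocols such as DNS pair a reply with its query. A unidirectional sender such as syslog never produces a match, so no response time can be derived for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdpMessage:
    timestamp: float      # capture time in seconds
    src: str
    dst: str
    transaction_id: int   # e.g. the DNS query ID; 0 where the protocol has none
    is_response: bool


def application_response_times(messages):
    """Return [((client, server, transaction_id), delay_seconds), ...]."""
    pending = {}  # (client, server, transaction_id) -> request timestamp
    results = []
    for m in sorted(messages, key=lambda msg: msg.timestamp):
        if not m.is_response:
            pending.setdefault((m.src, m.dst, m.transaction_id), m.timestamp)
        else:
            # A response travels server -> client, so reverse the addresses.
            key = (m.dst, m.src, m.transaction_id)
            if key in pending:
                results.append((key, m.timestamp - pending.pop(key)))
    return results


messages = [
    UdpMessage(2.000, "192.0.2.10", "192.0.2.53", 4711, False),  # DNS query
    UdpMessage(2.030, "192.0.2.53", "192.0.2.10", 4711, True),   # DNS answer
    UdpMessage(3.000, "192.0.2.10", "192.0.2.99", 0, False),     # syslog, one way
]
print(application_response_times(messages))
```
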


third: core2 probe
I know that using a core2 probe, with all mirrored traffic,
I can have a lot of information about network traffic ...

<<Extreme>>Coreflow2 produces similar results to Application Telemetry but does so with different techniques. The Coreflow2 based switches produce line rate Netflow along with a special hardware based mirror of just the first 15 packets of each flow. The special mirror (often referred to as MirrorN) limits the amount of raw packets that are fed to the Analytics DPI engine in a similar fashion to how the hardware based ACLs limit the raw packets within Application Telemetry.
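
The "first 15 packets of each flow" behavior can be pictured with a small software sketch. This only models the idea; the real MirrorN function runs in switch hardware, and the function name, the flow keys and the packet representation here are hypothetical.

```python
from collections import defaultdict


def first_n_per_flow(packets, n=15):
    """Yield only the first n packets of each flow, dropping the rest.

    packets is an iterable of (flow_key, payload) pairs; flow_key is any
    hashable value the caller uses to identify a flow.
    """
    seen = defaultdict(int)  # flow_key -> packets already passed through
    for flow_key, payload in packets:
        if seen[flow_key] < n:
            seen[flow_key] += 1
            yield flow_key, payload


# Hypothetical traffic: one long flow of 20 packets and one short flow of 3.
traffic = [("flow-a", i) for i in range(20)] + [("flow-b", i) for i in range(3)]
mirrored = list(first_n_per_flow(traffic))
print(len(mirrored))
```

The long flow contributes only its first 15 packets while the short flow is mirrored in full, which is how the volume sent to the DPI engine stays bounded regardless of flow length.
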

but ... 
is there any way to have a similar amount of information,
without the use of a core2 probe?

<<Extreme>>At present, Coreflow2 style Analytics requires a Coreflow2 based switch or ExtremeWireless. Application Telemetry requires an EXOS G2 based switch but soon other Extreme switch lines will be supported.

(I mean, also considering what I said in the previous "points")
What I don't understand is "WHY I always need a Core2 physical probe" ...

<<Extreme>>The engineering design for Coreflow2 Analytics requires the line rate Netflow in conjunction with the MirrorN raw packets.

I can understand it for a "huge amount" of traffic,
but IF I need more info about a very specific flow, or very specific traffic,
why can't I have "software" directly on the Analytics VM that can analyze that traffic ...

<<Extreme>>What you are describing is more of a debug feature which has value as a concept. However, ExtremeAnalytics is designed for capturing large swaths of data from across the network as its primary role. In this role the aggregated analytics allow for many varied business and network decisions to be made based on the collected analytics.

moreover now I have the GRE tunnel option, which is "super useful"!! ...

 

<<Extreme>>Coreflow2 Analytics supports GRE tunnels for transferring raw mirrored packets across the network to the ExtremeAnalytics appliance. Similarly, Application Telemetry uses ERSPAN. Yes, both are quite useful for this transport functionality.

Stefano Dall'Osto

thanks a lot for the reply ... very very useful :p

from what I understood,
with the ACL stuff on the switch I can get
- the application type
- network and application response time

with sflow stuff I can get:
- bandwidth usage
- top talkers

please check the below pictures



in this you can see RTP, SIP, and RTCP traffic ...
I only have the application response time of the RTP stuff ...



in this second picture,
I can see similar traffic, plus an ssh session, with only the network response time ...



in this third picture, the last one for now, I can see the unidirectional traffic for the same client,
10.185.48.187,
with a lot of application and network response times for the HTTP stuff,
and some application response time for RTP

what I would like to achieve is to have as much as possible for the network response time and application response time for that client,
10.185.48.187,
whose most critical traffic is SIP

why don't I see any application response time or network response time for SIP traffic? is it because of the type of traffic? is it because I'm not getting all the needed information? or is there no way to get those parameters with the current configuration?
and why am I getting only network response time OR application response time, and NOT both, for all the flows (SSH, RTP, sometimes also HTTP)?

please let me know what you think

thanks a lot

best regards

Stefano

Mike Thomas, Employee - GTAC - NMS

Hi Stefano,
I think there may be a few things to consider. Regardless of CoreFlow2 or Application Telemetry, we don't always get the mirror traffic needed to support a response time, even with TCP based traffic. It looks like the majority of your SIP and RTP traffic is using UDP, so response times are not being picked up for those flows.
It's also important to consider timing in some of these setups. An sflow sample will always report a payload, but if it is not a new connection, the traffic for the initial handshake may be missed, and the fingerprint may not show the correct traffic pattern.
Analytics in general is geared more toward long term reporting than short run flow detection. What you are looking at is the local flow collection from the Analytics appliance. Over time it takes the important Top100 applications and Top100 users slow connections and builds reports based on that. Most of this flow data is temporary in nature; however, it is useful for short term setup troubleshooting.