Feed Purview data into Splunk

  • Question
  • Updated 4 years ago
  • Answered
I've found the white paper on integrating Splunk with Purview, and it looks great, but I can't find any technical detail on how to get the data from Purview into Splunk. What's the process for bringing the data across?
James A, Ambassador

Posted 4 years ago


Tamera Rousseau-Vesta, Extreme Alumna

Hi James, I'm getting this answered for you; apologies for the delay.
Ferrer, Salvador, Employee
Hi James,

Apologies for the delay. Purview has a syslog exporter for flows; the configuration is hidden because it was added late in development, just before release.

In the file /opt/appid/conf/appid/appidconfig.xml you can add a line like:

<Syslog enabled="yes" priority="LOG_ERR" facility="LOG_DAEMON" metadataDelimiter=" "/>

In Purview, edit the file /etc/rsyslog.d/50-default.conf and insert a line like:

daemon.err              @<splunk_ip>

Then you can move to Splunk and configure a syslog collector.
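On the Splunk side, a minimal sketch of the collector configuration might look like the following. This assumes the rsyslog default UDP transport on port 514; adjust the port and protocol to match your deployment:

```
# $SPLUNK_HOME/etc/system/local/inputs.conf -- minimal UDP syslog input (sketch)
[udp://514]
sourcetype = syslog
connection_host = ip
```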

Another option we are exploring is exporting the data stored in the NetSight database to Splunk using the JDBC connector in Splunk. That is still experimental, since it involves manipulating the NetSight database in ways that are not supported by GTAC; we will provide an update on this integration by the end of this quarter.

James A, Ambassador
Hi Salvador, thanks for the info. I've just had a look in appidconfig.xml, and there are actually two lines:

  <Syslog enabled="no" priority="LOG_ERR" facility="LOG_DAEMON" metadataDelimiter="newline"/>
  <Splunk enabled="no" priority="LOG_ERR" facility="LOG_DAEMON"/>

Should I uncomment the splunk one then add the line to rsyslog.d?
Ferrer, Salvador, Employee
You should change the metadata separator to something other than newline, e.g. a space with metadataDelimiter=" "; otherwise Splunk will interpret the metadata lines as new events. Adding the line to the syslog config should be all you need. After changing appidconfig.xml you should restart Purview with appidctl restart. I don't remember whether you need to restart syslog after changing the rsyslog.d file; if your changes are not applied immediately, try restarting the syslog daemon (probably with service rsyslog restart - I usually reboot the whole appliance, but a service restart should suffice).
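Pulling the steps from this thread together, the sequence on the Purview appliance would look roughly like this (paths and commands are as given above; verify them against your release before running):

```shell
# 1. Enable the exporter in /opt/appid/conf/appid/appidconfig.xml,
#    using a space as the metadata delimiter so Splunk does not
#    treat each metadata line as a separate event:
#    <Syslog enabled="yes" priority="LOG_ERR" facility="LOG_DAEMON" metadataDelimiter=" "/>

# 2. Forward daemon.err messages to Splunk (replace <splunk_ip>):
echo 'daemon.err              @<splunk_ip>' >> /etc/rsyslog.d/50-default.conf

# 3. Restart Purview, then the syslog daemon:
appidctl restart
service rsyslog restart
```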
James A, Ambassador
OK, I enabled the Splunk one, added the entry to rsyslog.d, and now I'm getting data in Splunk :) Although I don't seem to be getting the User, Profile and Detailed Location fields - are these added in NetSight rather than on Purview?
Ferrer, Salvador, Employee
Yep, those details are added by NetSight and are not available when the syslog events are created.

You can get more data about a flow using the approach in my other post: using web services to access flow information. You get more data about each flow via web services, but it is also more challenging in terms of scalability and flow processing, since you have to avoid processing duplicate flow entries or missing flow entries.
Ferrer, Salvador, Employee
I forgot to mention another option: Splunk can query web services using the rest_ta application from the Splunk app store. If you point that data source to

"https://netsight_ip:8443/axis/service..."

(I've been trying hard to stop the Hub from interpreting the above as a real HTTP link and formatting it its own way, but no luck. If you need to see the full URL, right-click the link above, select "copy link", and paste it somewhere :( )

you get all flows in memory in Purview with all their data. You will need to hack a bit on the code of the rest_ta application to process the data into Splunk, but we can help you with that.

The problem with this approach is that every X seconds (the polling interval in the rest_ta application) you get Y flows (defined in the web service URL), irrespective of whether you already got them before. That brings two challenges: Splunk sees duplicate flows, because the same flow appears in successive web service calls, and you may miss some flows, because if you don't size the number of flows to query correctly, some of them will have aged out before you issue the call.
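The duplicate-flow problem described above can be handled with a simple seen-set over a stable flow identity. A minimal sketch follows; the field names ("flowId", "startTime") are hypothetical - the actual Purview web-service response schema may differ:

```python
# Sketch: de-duplicating flow records across successive rest_ta polls.
# A flow is identified here by its (flowId, startTime) pair, since the
# same in-memory flow can appear in consecutive web-service calls.
# Field names are assumptions, not the real Purview schema.

def dedupe_flows(polls):
    """Yield each flow only once across overlapping poll results."""
    seen = set()
    for poll in polls:
        for flow in poll:
            key = (flow["flowId"], flow["startTime"])
            if key not in seen:
                seen.add(key)
                yield flow

# Example: two overlapping polls; flow 2 appears in both.
polls = [
    [{"flowId": 1, "startTime": 100}, {"flowId": 2, "startTime": 105}],
    [{"flowId": 2, "startTime": 105}, {"flowId": 3, "startTime": 110}],
]
unique = list(dedupe_flows(polls))
print([f["flowId"] for f in unique])  # -> [1, 2, 3]
```

Note that the seen-set grows without bound in this sketch; in a long-running collector you would expire keys older than the Purview flow aging window.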

Again, we are working on solutions to these issues, either with specific Splunk configurations or by redesigning the web service call; they will be published in an update to the paper you already read.
Tamera Rousseau-Vesta, Extreme Alumna
Hi James, please let us know if this answers your questions. Thank you!!
James A, Ambassador
It does, thank you :)
Tamera Rousseau-Vesta, Extreme Alumna
No, thanks for such a great topic!!