I have a Tableau Server (on Windows) that continuously generates logs. I want to stream them to Apache Spark for real-time analysis. I looked at the following solutions, but none seems to satisfy the requirement.

1) Using the nxlog agent. This is not scalable, since in the future logs may come from multiple Tableau servers.

2) Fluentd and Flume are not compatible with Windows.

3) Kafka is out of the question, since it does not tail a log file.

What would be a scalable solution to this problem? The major limitation is that Tableau Server runs on Windows.

rusty

1 Answer

One option (the one I would personally use) is Logstash (http://logstash.net/) together with Apache Kafka.

Searching for "logstash windows" on Google brings up a few tutorials.

Ruling out Kafka because it doesn't tail a log file doesn't really make a lot of sense. :)
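To illustrate the division of labour: tailing the file is the shipper's job (Logstash here), and Kafka only receives the records. Below is a minimal Python sketch of what such a shipper does on each poll, not Logstash's actual implementation. The names `read_new_lines` and `ship` and the `send` callback are hypothetical; `send` stands in for whatever producer call delivers a record to Kafka.

```python
import os
from typing import Callable, List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines written to `path` after byte `offset`, plus the new offset."""
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank (truncated or rotated), so start again from the beginning.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline; a partial trailing line waits for the next poll.
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = data[: end + 1]
    # Drop the empty piece after the final newline and strip Windows "\r" endings.
    lines = [
        raw.decode("utf-8", errors="replace").rstrip("\r")
        for raw in complete.split(b"\n")[:-1]
    ]
    return lines, offset + len(complete)


def ship(path: str, send: Callable[[str], None], offset: int = 0) -> int:
    """Forward every new complete line to `send` and return the offset to resume from."""
    lines, offset = read_new_lines(path, offset)
    for line in lines:
        send(line)  # hypothetical sink, e.g. a wrapper around a Kafka producer
    return offset
```

Calling `ship` in a loop with a short sleep, and persisting the returned offset, gives basic tail-and-forward behaviour. A real shipper such as Logstash also handles file rotation by identity, multiple files, and delivery retries, which this sketch does not.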

Jon Bringhurst
  • Thank you. Actually, I have to collect and send i) the Tableau Postgres database, ii) the Windows Performance Monitor CSV file, and iii) the msinfo32 output, which is a text file on Windows. So I have three sources, and I want to stream them to Spark. – rusty Jan 12 '15 at 10:21
  • I went through this blog post, and what it says about Kafka matches what I said: it does not tail a log file, and you need to log through its API, which is not an option for me. I might have understood it wrong :) http://jasonwilder.com/blog/2013/07/16/centralized-logging-architecture/ – rusty Jan 12 '15 at 10:40
  • Hey @Rusty, Logstash supports Kafka. You can use Logstash to send logs to Kafka. – Jon Bringhurst Mar 24 '15 at 18:38