
I have a PySpark streaming application that runs on YARN in a Hadoop cluster. The streaming application reads from a Kafka queue every n seconds and makes a REST call.

I have a logging service in place to provide an easy way to collect and store data, send data to Logstash and visualize data in Kibana. The data needs to conform to a template (JSON with specific keys) provided by this service.
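To make the template requirement concrete, here is a minimal sketch of how such a record could be built. The key names (`@timestamp`, `application`, `level`, `message`) are placeholders, since the actual template keys come from the logging service and are not shown here:

```python
import json
from datetime import datetime, timezone

def build_log_record(app_name, level, message, extra=None):
    """Build a JSON log record matching a hypothetical service template.

    The keys below are assumptions -- substitute the keys your
    logging service's template actually requires.
    """
    record = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "application": app_name,
        "level": level,
        "message": message,
    }
    if extra:
        record.update(extra)  # merge in any batch-specific fields
    return json.dumps(record)
```

For example, `build_log_record("kafka-streamer", "INFO", "batch processed", {"records": 42})` would yield a JSON string with those fields plus a timestamp.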

I want to send logs from the streaming application to Logstash using this service. For this, I need to do three things:

- Collect some data while the streaming app is reading from Kafka and making the REST call.
- Format it according to the logging service template.
- Forward the formatted log to the Logstash host.
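One way to cover the last two steps with Python's `logging` module is a custom handler that serializes each record as a JSON line and writes it to a Logstash TCP input. This is only a sketch: it assumes the Logstash side uses the `tcp` input plugin with `codec => json_lines`, and the JSON keys are placeholders for the real service template:

```python
import json
import logging
import socket

def to_json_line(record):
    """Serialize a logging.LogRecord as one JSON line.

    The key names here are placeholders -- replace them with the
    keys your logging service template requires.
    """
    return json.dumps({
        "level": record.levelname,
        "logger": record.name,
        "message": record.getMessage(),
    }) + "\n"

class LogstashTcpHandler(logging.Handler):
    """Sketch of a handler shipping JSON lines to a Logstash TCP input."""

    def __init__(self, host, port):
        super().__init__()
        self.host, self.port = host, port

    def emit(self, record):
        try:
            # Open a short-lived connection per record; a production
            # handler would keep the socket open and reconnect on failure.
            with socket.create_connection((self.host, self.port), timeout=5) as sock:
                sock.sendall(to_json_line(record).encode("utf-8"))
        except OSError:
            self.handleError(record)
```

Attaching this handler to a logger on the driver is straightforward; note that in YARN cluster mode the executors run as separate processes, so code running inside transformations shipped to executors would need the handler configured there as well.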

Any guidance related to this would be very helpful.

Thanks!

  • Which logging framework are you using? Is it Python's? And do you manage to log from the driver and all the executors? Another important question: what is the master of your Spark application — local, YARN, Mesos, or standalone? – user1314742 Apr 21 '17 at 08:44
  • @user1314742 I am trying to use Python's logging module. The master of my Spark application is YARN, and I want to run this job in cluster mode. – activelearner Apr 21 '17 at 19:46
