
We've tried a variety of solutions, including changing the log4j.properties file, shipping the file to the executors via --files and pointing them at it with a -Dlog4j.configuration option passed via --conf, and also updating the configuration of the EMR cluster itself.

WARN messages from the system are visible in the executor logs, and WARN messages from the main class are visible too, but no messages from any of our other classes are coming through and we're not sure what the problem could be.

The logging level itself is fine, as shown by the messages generated by Spark, but the messages from the other classes are not getting through.

The driver log (from the main page in the EMR console) shows DEBUG messages from the other classes; the executor logs (via the Spark UI Executors tab) do not.

Any help is hugely appreciated, thanks.

It's a streaming app running on Spark 1.6. Below are some of the options we've tried.

Running the step normally: Arguments: spark-submit --deploy-mode client --master yarn --class main jarLoc

Extra spark logging config set at cluster configuration level:

{"classification":"spark-log4j", "properties":{"log4j.logger.MainClass$":"DEBUG", 
"log4j.logger.org.apache.spark":"WARN", "log4j.logger.org.apache.hadoop":"WARN", "log4j.logger.com.amazonaws.services":"WARN", "log4j.logger.com.companyName":"INFO", "log4j.logger.org.spark-project":"WARN"}, "configurations":[]}

Current log4j properties file:

log4j.rootLogger=INFO, STDOUT
log4j.logger.deng=DEBUG
log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout
log4j.appender.STDOUT.layout.ConversionPattern=%d{yyyy-MM-dd hh:mm:ss} %t %x %-5p %-10c:%m%n

log4j.logger.MainClass$=DEBUG
log4j.logger.com.sessioncam=INFO
log4j.logger.org.apache.spark=WARN
log4j.logger.com.amazonaws.services=WARN
log4j.logger.org.spark-project=WARN
log4j.logger.org.apache.hadoop=WARN
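
For context, log4j resolves these categories by logger name, so a class running inside the executors only picks up one of the levels above if its logger is created under the matching package. A minimal sketch of the kind of class we mean (the class name is just a placeholder):

package com.sessioncam

import org.apache.log4j.Logger

object RecordCleaner {
  // Logger name resolves to "com.sessioncam.RecordCleaner$", which falls under
  // the log4j.logger.com.sessioncam=INFO category configured above.
  lazy val log = Logger.getLogger(getClass)

  def clean(record: String): String = {
    // When called inside a map over an RDD/DStream this runs on an executor,
    // so the message should appear in that executor's logs, not the driver's.
    log.info(s"cleaning record: $record")
    record.trim
  }
}

Anything logged under a name that doesn't match one of these categories falls back to the INFO rootLogger.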

Things I've tried:

spark-submit --deploy-mode client --master yarn --class MainClass --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/tmp/files/log4j.properties /tmp/files/jar.jar


Arguments: spark-submit --deploy-mode client --master yarn --class MainClass --files /tmp/files/log4j.properties /tmp/files/jar.jar
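
For completeness, the combined form, where --files ships the properties file into each container's working directory and both the driver and executor JVMs are then pointed at it by bare file name, would look something like this (we haven't verified that this resolves the issue):

spark-submit --deploy-mode client --master yarn --class MainClass \
  --files /tmp/files/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  /tmp/files/jar.jar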
  • did you try: spark-submit --master yarn-cluster --files /path/to/log4j-spark.properties --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-spark.properties" – Tal Joffe Dec 18 '16 at 14:14

1 Answer


The logs end up in the local JVMs of the executors that are actually processing the tasks.

When you go to the Spark UI you can view the executors. By opening an executor's stdout and stderr you can see the output of that actual executor:

(screenshot: Spark UI Executors tab with links to each executor's stdout and stderr)
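
If YARN log aggregation is enabled on the cluster, the same per-executor output can also be pulled from the command line once the application has finished, for example (the application id is a placeholder):

yarn logs -applicationId application_1474000000000_0001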

Hope this helps.

Fokko Driesprong
  • I'm afraid not, these logs are the ones I was referring to when I mentioned accessing through the UI/Executor tab - the logs are fine except they do not contain the logging messages that should be generated. – null Sep 30 '16 at 09:39
  • did you end up figuring this out? running into the same issue – Nick Resnick Sep 10 '19 at 19:23