
How does one log on Qubole / access logs from Spark on Qubole? The setup I have:

  • java library (JAR)
  • Zeppelin Notebook (Scala), simply calling a method from the library
  • Spark, Yarn cluster
  • Log4j2 used in the library (configured to log on stdout)
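
For reference, the library's logging is wired up roughly like this minimal log4j2.xml (a sketch; the logger names and pattern are illustrative), so everything should end up on stdout:

```xml
<!-- Minimal log4j2.xml: route all library logging to stdout, so messages
     should land in each executor's stdout container log -->
<Configuration status="WARN">
  <Appenders>
    <Console name="Stdout" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %-5level %logger{36} - %msg%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Stdout"/>
    </Root>
  </Loggers>
</Configuration>
```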

How can I access my logs from the log4j2 logger? What I tried so far:

  • Looking into the 'Logs' section of my Interpreters
  • Going through Spark UI's stdout logs of each executor
— bde.dev

1 Answer


When a Spark job or application fails, you can use the Spark logs to analyze the failures.

The QDS UI provides links to the logs in the Application UI and Spark Application UI.

If you are running the Spark job or application from the Analyze page, you can access the logs via the Application UI and Spark Application UI.

If you are running the Spark job or application from the Notebooks page, you can access the logs via the Spark Application UI.

You can also access additional logs to identify the errors and exceptions in Spark job or application failures.

Accessing the Application UI

To access the logs via the Application UI from the Analyze page of the QDS UI:

Note the command id, which is unique to the Qubole job or command. Click the down arrow on the right of the search bar. The Search History page appears as shown in the following figure.

[Figure: spark-debug1.png]

Enter the command id in the Command Id field and click Apply.

Logs of any Spark job are displayed in Application UI and Spark Application UI, which are accessible in the Logs and Resources tabs. The information in these UIs can be used to trace any information related to command status.

The following figure shows an example of Logs tab with links.

Click on the Application UI hyperlink in the Logs tab or Resources tab.

The Hadoop MR application UI is displayed as shown in the following figure.

[Figure: application-ui.png]

The Hadoop MR application UI displays the following information:

  • MR application master logs
  • Total Mapper/Reducer tasks
  • Completed/Failed/Killed/Successful tasks

Note

The MR application master logs correspond to the Spark driver logs. For any Spark driver-related issues, you should check the AM logs (driver logs).

If you want to check the exceptions of failed jobs, click the logs link on the Hadoop MR application UI page. The Application Master (AM) logs page, which contains stdout, stderr, and syslog, is displayed.
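
The same AM and container logs can also be pulled from a shell on the cluster with the standard `yarn` CLI, assuming YARN log aggregation is enabled; the application id below is a made-up placeholder, so copy the real one from the Application UI:

```shell
# Placeholder application id -- copy the real one from the Application UI
APP_ID="application_1578000000000_0001"

# Aggregate every container's stdout/stderr/syslog into one file.
# 'yarn logs' only succeeds after the application finishes and its logs
# are aggregated, so failures are ignored here.
yarn logs -applicationId "$APP_ID" > "${APP_ID}.log" 2>/dev/null || true

# Driver (AM) exceptions usually surface in the stderr/syslog sections
grep -nE "ERROR|Exception" "${APP_ID}.log" | head -n 20 || true
```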

Accessing the Spark Application UI

You can access the logs by using the Spark Application UI from the Analyze page and the Notebooks page.

From the Analyze page

  1. From the Home menu, navigate to the Analyze page.
  2. Note the command id, which is unique to the Qubole job or command.
  3. Click the down arrow on the right of the search bar. The Search History page appears (figure: spark-debug1.png).
  4. Enter the command id in the Command Id field and click Apply.
  5. Click the Logs tab or Resources tab.
  6. Click the Spark Application UI hyperlink.

From the Notebooks page

From the Home menu, navigate to the Notebooks page.

Click on the Spark widget on the top right and click on Spark UI as shown in the following figure.

[Figure: spark-ui.png]

OR

Click on the i icon in the paragraph as shown in the following figure.

[Figure: spark-debug2.png]

When you open the Spark UI from the Spark widget of the Notebooks page or from the Analyze page, the Spark Application UI is displayed in a separate tab as shown in the following figure.

[Figure: spark-application-ui.png]

The Spark Application UI displays the following information:

Jobs: The Jobs tab shows the total number of completed, succeeded, and failed jobs. It also shows the number of succeeded stages for each job.

Stages: The Stages tab shows the total number of completed and failed stages. If you want to check more details about the failed stages, click on the failed stage in the Description column. The details of the failed stages are displayed as shown in the following figure.

[Figure: spark-app-stage.png]

The Errors column shows the detailed error message for the failed tasks. Note the executor id and the hostname; you need them to locate the corresponding container logs. For the full error stack trace, check the container logs.

Storage: The Storage tab displays the cached data if caching is enabled.

Environment: The Environment tab shows information about the JVM, Spark properties, system properties, and classpath entries, which helps you determine the values that the Spark cluster uses at runtime. The following figure shows the Environment tab.

[Figure: spark-app-env.png]

Executors: The Executors tab shows the container logs. You can map the container logs using the executor id and the hostname displayed in the Stages tab.

Spark on Qubole provides the following additional fields in the Executors tab:

  • Resident size/Container size: Displays the total physical memory used within the container (the executor's Java heap + off-heap memory) as Resident size, and the configured YARN container size (executor memory + executor overhead) as Container size.
  • Heap used/committed/max: Displays values corresponding to the executor's Java heap.

The following figure shows the Executors tab.

[Figure: spark-app-exec.png]

The Logs column shows the links to the container logs. Additionally, the number of tasks executed by each executor is displayed, broken down into active, failed, completed, and total tasks.

Note

For debugging container memory issues, you can check the statistics on container size, Heap used, the input size, and shuffle read/write.
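
If those statistics point at memory pressure, the usual knobs are the executor heap and the YARN memory overhead, which together must fit inside the container. The values below are purely illustrative, not recommendations:

```
# Executor Java heap
spark.executor.memory               4g
# Off-heap headroom YARN adds on top of the heap (MB in older Spark
# versions, where this property name applies)
spark.yarn.executor.memoryOverhead  768
```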

Accessing Additional Spark Logs

Apart from accessing the logs from the QDS UI, you can also access the following logs, which reside on the cluster, to identify errors and exceptions in Spark job failures:

Spark History Server Logs: The spark-yarn-org.apache.spark.deploy.history.HistoryServer-1-localhost.localdomain.log files are stored at /media/ephemeral0/logs/spark. The Spark history server logs are stored only on the master node of the cluster.

Spark Event Logs: The Spark eventlog files are stored at <scheme><defloc>/logs/hadoop/<cluster_id>/<cluster_inst_id>/spark-eventlogs, where:

  • scheme is the Cloud-specific URI scheme: s3:// for AWS; wasb://, adl://, or abfs[s] for Azure; oci:// for Oracle OCI.
  • defloc is the default storage location for the QDS account.
  • cluster_id is the cluster ID as shown on the Clusters page of the QDS UI.
  • cluster_inst_id is the cluster instance ID. Contact Qubole Support to obtain the cluster instance ID.
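
Assuming the path assembles as scheme + defloc + /logs/hadoop/ + cluster id + / + cluster instance id + /spark-eventlogs, you can list the event logs from a shell with the Hadoop CLI. Every value below is a made-up placeholder:

```shell
# All four values are placeholders; substitute your account's real ones
SCHEME="s3://"             # cloud-specific URI scheme (AWS here)
DEFLOC="my-defloc-bucket"  # default storage location of the QDS account
CLUSTER_ID="1234"          # from the Clusters page of the QDS UI
CLUSTER_INST_ID="5678"     # obtain from Qubole Support

# Assemble the event-log directory from the pieces defined above
EVENTLOG_DIR="${SCHEME}${DEFLOC}/logs/hadoop/${CLUSTER_ID}/${CLUSTER_INST_ID}/spark-eventlogs"
echo "$EVENTLOG_DIR"

# Listing requires the Hadoop CLI and cloud credentials on this host,
# so failures are ignored here
hadoop fs -ls "$EVENTLOG_DIR" 2>/dev/null || true
```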

— Anushan