The web interface of a running Spark application, used to monitor and inspect Spark job executions in a web browser.
Questions tagged [spark-ui]
76 questions
1
vote
0 answers
What is the difference between AWS EMR's "Elapsed Time" and Spark UI "Task Time"?
On EMR, I see that my job took 12 minutes to run, according to the Elapsed Time column. However, when I go to the Spark UI > Executors tab, the Task Time (GC Time) shows 1 hr (4 s).
I totalled up the times as well as I could from the Event Timeline and…

tallwithknees
1
vote
0 answers
How to pass an environment variable to spark-defaults.conf
I want to run the Apache Spark history server in a Docker image. To achieve this, I had to change spark-defaults.conf and add this line:
spark.history.fs.logDirectory /path/to/remote/logs
And then run start-history-server.sh
This works fine when I set the value…

svg_af_2
1
vote
1 answer
Apache Spark: How to detect data skew using Spark web UI
Data skew is something that happens often and should be detected and treated correctly. I'm able to detect data skew in a specific table using a groupby/count query on the joining key; however, I have multiple joins in my application and doing that for…

svg_af_2
1
vote
0 answers
Spark SQL - EXPLAIN, DESCRIBE statements not shown in Spark UI
I recently realized that Spark SQL auxiliary statements (EXPLAIN, DESCRIBE, SHOW CREATE, etc.) are not shown in the Spark UI. I have a use case to track all the queries executed through a Spark SQL JDBC connection; only these statements go untracked.
So, my…

Kondasamy Jayaraman
1
vote
0 answers
Understanding Query Plan of a Spark SQL Query
I am trying to understand the Physical Plan of a Spark SQL query. I am using Spark SQL v 2.4.7.
Below is a partial query plan generated for a big query.
: +- ReusedQueryStage 16
: +- BroadcastQueryStage 7
: +- BroadcastExchange…

marie20
1
vote
0 answers
Cannot view the Spark UI using a CloudFormation stack
I want to enable the Spark UI for my Glue jobs. I followed Enabling the Spark UI for Jobs and Launching the Spark History Server, using the default yml file provided by that guide to launch a stack on CloudFormation. After the stack was…

YvonneW
1
vote
1 answer
How to Export Jobs/Stages Logs from the Spark UI of a Databricks Cluster
In Databricks, I would like to export the jobs/stages logs that we see in the Spark UI to a custom location for analysis. How can we do this?
Thanks.

SriramN
1
vote
1 answer
Why doesn't AWS Glue generate Spark event logs?
I have an AWS Glue job with the Spark UI enabled by following these instructions: Enabling the Spark UI for Jobs
The Glue job has s3:* access to the arn:aws:s3:::my-spark-event-bucket/* resource. But for some reason, when I run the Glue job (and it…

pyspark-developer
1
vote
1 answer
Is there a more systematic way to resolve a slow AWS Glue + PySpark execution stage?
I have this code snippet that I ran locally in standalone mode using 100 records only:
from awsglue.context import GlueContext
glue_context = GlueContext(sc)
glue_df = glue_context.create_dynamic_frame.from_catalog(database=db, table_name=table)
df…

pyspark-developer
1
vote
1 answer
Why is the total uptime in the Spark UI not equal to the sum of all job durations?
I run a Spark job and am trying to tune it to run faster. Oddly, the total uptime is 1.1 hours, but when I add up all the job durations, they only come to 25 mins.
I'm curious why the total uptime in the Spark UI is not equal to the sum of all job…

avseq
1
vote
0 answers
How to interpret the details graph for a stage in the Spark UI
I am seeing this details graph in the Spark UI:
I have a couple of questions regarding this graph:
1- Why do Scheduler Delay and Task Deserialization take so long compared to computing time? Does this mean something is wrong with job optimization (with…

honor
1
vote
0 answers
Spark UI is completely distorted
Whenever I launch my Spark application, the Spark Master UI is completely distorted and I am not able to navigate to the SQL/Storage and other tabs.
I tried multiple browsers, but every time it's the same.
Please let me know of any property related to…

Rakesh
1
vote
0 answers
Spark UI: How to balance processed data volume between the cores of the same executor
For a shuffle action, I see that the data processed by the cores of the same executor is not balanced, and of course the core that takes the longest will slow down the whole process.
So I would like to know if it is possible to make some modification,…

mingzhao.pro
1
vote
1 answer
Spark Job UI - time/duration values under the name of a step
I have a simple question: what are the times at the top of the WholeStageCodegen rectangles in the Spark UI? Are they processing times?

mLC
1
vote
1 answer
Streaming tab is not showing for structured streaming
I am using Structured Streaming to read CSVs and write to Kafka. The Streaming tab is not showing in the Spark UI (I am not using a streaming context).
val userSchema = new StructType().add("name", "string").add("age", "integer")
val csvDF = spark
…

sam8686