Questions tagged [spark-ui]

The web interface of a running Spark application, used to monitor and inspect Spark job executions in a web browser.

76 questions
1 vote, 0 answers

What is the difference between AWS EMR's "Elapsed Time" and Spark UI "Task Time"?

On EMR I see that my job took 12 minutes to run, according to the Elapsed Time column. However, when I go to the Spark UI > Executors tab, the Task Time (GC Time) shows 1 hr (4 s). I totalled up the times as well as I could from the Event Timeline and…
tallwithknees
1 vote, 0 answers

How to pass an environment variable to spark-defaults.conf

I want to run the Apache Spark history server in a Docker image. To achieve this, I had to change spark-defaults.conf, add the line spark.history.fs.logDirectory /path/to/remote/logs, and then run start-history-server.sh. This works fine when I set the value…
1 vote, 1 answer

Apache Spark: How to detect data skew using Spark web UI

Data skew happens often and should be detected and treated correctly. I'm able to detect data skew in a specific table using a groupby/count query on the joining key; however, I have multiple joins in my application and doing that for…
svg_af_2
1 vote, 0 answers

Spark SQL - EXPLAIN, DESCRIBE statements not shown in SparkUI

I recently realized that Spark SQL auxiliary statements (EXPLAIN, DESCRIBE, SHOW CREATE, etc.) are not shown in the Spark UI. I have a use case to track all the queries executed through a Spark SQL JDBC connection; only these statements go untracked. So, my…
Kondasamy Jayaraman
1 vote, 0 answers

Understanding Query Plan of a Spark SQL Query

I am trying to understand the Physical Plan of a Spark SQL query. I am using Spark SQL v 2.4.7. Below is a partial query plan generated for a big query. : +- ReusedQueryStage 16 : +- BroadcastQueryStage 7 : +- BroadcastExchange…
marie20
1 vote, 0 answers

Cannot view the Spark UI using a CloudFormation stack

I want to enable the Spark UI for my Glue jobs. I followed "Enabling the Spark UI for Jobs" and "Launching the Spark History Server", using the default yml file provided by that guide to launch the stack on CloudFormation. After the stack was…
YvonneW
1 vote, 1 answer

How to Export Jobs/Stages Logs from the Spark UI of a Databricks Cluster

In Databricks, I would like to export the jobs/stages logs that we see in the Spark UI to a custom location for analysis. How can we do this? Thanks.
SriramN
1 vote, 1 answer

Why doesn't AWS Glue generate Spark event logs?

I have an AWS Glue job with the Spark UI enabled by following these instructions: Enabling the Spark UI for Jobs. The Glue job has s3:* access to the arn:aws:s3:::my-spark-event-bucket/* resource. But for some reason, when I run the Glue job (and it…
1 vote, 1 answer

Is there a more systematic way to resolve a slow AWS Glue + PySpark execution stage?

I have this code snippet that I ran locally in standalone mode using 100 records only: from awsglue.context import GlueContext glue_context = GlueContext(sc) glue_df = glue_context.create_dynamic_frame.from_catalog(database=db, table_name=table) df…
1 vote, 1 answer

Why is the total uptime in the Spark UI not equal to the sum of all job durations?

I am running a Spark job and trying to tune it to run faster. Oddly, the total uptime is 1.1 hours, but when I add up all the job durations, the total is only 25 minutes. I'm curious why the total uptime in the Spark UI is not equal to the sum of all job…
avseq
1 vote, 0 answers

How to interpret the details graph for a stage in the Spark UI

I am seeing this details graph in the Spark UI. I have a couple of questions regarding this graph: 1. Why do scheduler delay and task deserialization take so long compared to computing time? Does this mean something is wrong with job optimization (with…
honor
1 vote, 0 answers

Spark UI is completely distorted

Whenever I launch my Spark application, the Spark Master UI is completely distorted and I am not able to navigate to the SQL, Storage, and other tabs. I tried multiple browsers, but it's the same every time. Please let me know of any property related to…
Rakesh
1 vote, 0 answers

Spark UI: How to balance processed data volume between the cores of the same executor

For a shuffle action, I see that the data processed by the cores of the same executor is not balanced, and of course the core that takes the longest will slow down the whole process. So I would like to know if it is possible to make some modification,…
mingzhao.pro
1 vote, 1 answer

Spark Job UI - time/duration values under the name of a step

I have a simple question: what are the times at the top of the WholeStageCodegen rectangles in the Spark UI? Are they processing times?
mLC
1 vote, 1 answer

Streaming tab is not showing for Structured Streaming

I am using Structured Streaming to read CSVs and write to Kafka. The Streaming tab is not showing in the Spark UI (I am not using a streaming context). val userSchema = new StructType().add("name", "string").add("age", "integer") val csvDF = spark …