The web interface of a running Spark application, used to monitor and inspect Spark job executions in a web browser.
Questions tagged [spark-ui]
76 questions
0
votes
1 answer
Computing optimal Shuffle Partitions and mitigating Skew in Spark SQL Query
I work with Spark SQL v2.4.7 on EMR (with YARN). I write Spark SQL queries to perform transformations.
Estimating the optimal Shuffle Partitions number for a complex query:
I am trying to estimate the number of optimal shuffle partitions that needs…
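A common rule of thumb, which is an assumption here and not something stated in the question, is to size shuffle partitions so that each one holds on the order of 100 to 200 MB of shuffled data. A minimal sketch of that arithmetic follows; the helper name `estimate_shuffle_partitions` and the 128 MiB target are hypothetical choices for illustration.

```python
def estimate_shuffle_partitions(shuffle_bytes, target_partition_bytes=128 * 1024 * 1024):
    """Return a partition count so each partition holds about target_partition_bytes.

    shuffle_bytes is the total shuffle size of the widest stage, e.g. the
    "Shuffle Write" figure read from the Stages tab (an assumed input here).
    """
    if shuffle_bytes < 0 or target_partition_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    partitions = -(-shuffle_bytes // target_partition_bytes)
    # Never return fewer than one partition.
    return max(1, partitions)


# Example: a stage that shuffles 50 GiB with a 128 MiB target per partition.
suggested = estimate_shuffle_partitions(50 * 1024**3)
print(suggested)
```

The result would then be a starting value for `spark.sql.shuffle.partitions`, to be adjusted after checking task durations and spill in the UI. This heuristic does not address skew, where a few keys dominate a single partition.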

marie20
- 723
- 11
- 30
0
votes
1 answer
View Spark UI for Jobs executed via Azure ADF
I am not able to view the Spark UI for Databricks jobs executed through a notebook activity in Azure Data Factory.
Does anyone know which permissions need to be added to enable this?

mehere
- 1,487
- 5
- 28
- 50
0
votes
1 answer
trace expensive part of code in spark ui back to part of pyspark
I have some PySpark code with a very large number of joins and aggregations. I've enabled the Spark UI and I've been digging into the event timeline, job stages, and DAG visualization. I can find the task ID and executor ID for the expensive parts. …

user3476463
- 3,967
- 22
- 57
- 117
0
votes
0 answers
Why does only the show() operation show up in the spark ui?
I currently have a project using Spark. For this project we are calculating some averages on a Dataset as follows:
public void calculateAverages() {
    this.data.show();
    String format = "HH";
    // Get the dataset such that the time column…

K. Tilman
- 41
- 5
0
votes
1 answer
IBM BPM delay on load event
I want to make a button click 10 seconds after the UI loads. I've tried the code below in the "On load" event, but it seems that the timer doesn't work:
function myFunction() {
    me.click();
}
setTimeout(myFunction, 10000);
Any ideas how to trigger the…

gromazazzz
- 9
- 5
0
votes
1 answer
spark tasks not starting to execute
I am running a job in the Spark shell:
--num-executors 15
--driver-memory 15G
--executor-memory 7G
--executor-cores 8
--conf spark.yarn.executor.memoryOverhead=2G
--conf spark.sql.shuffle.partitions=500
--conf…
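When tasks do not start, one thing worth checking is whether the cluster can grant what these flags request. The sketch below totals the request using only the values visible above. The helper name `summarize_request` is hypothetical, the calculation assumes one CPU per task (the Spark default), and it ignores the driver's own overhead and anything hidden behind the truncated `--conf…`.

```python
import math


def summarize_request(num_executors, executor_cores, executor_memory_gb,
                      overhead_gb, driver_memory_gb, shuffle_partitions):
    """Summarize the resources implied by a set of spark-shell flags."""
    # Each executor core runs one task at a time (assuming spark.task.cpus=1).
    task_slots = num_executors * executor_cores
    # YARN sizes each executor container as heap plus memory overhead.
    container_gb = executor_memory_gb + overhead_gb
    total_gb = num_executors * container_gb + driver_memory_gb
    # Number of rounds needed to run all shuffle tasks on the available slots.
    waves = math.ceil(shuffle_partitions / task_slots)
    return {
        "task_slots": task_slots,
        "container_gb": container_gb,
        "total_gb": total_gb,
        "shuffle_waves": waves,
    }


# Values taken from the flags listed in the question.
summary = summarize_request(
    num_executors=15, executor_cores=8, executor_memory_gb=7,
    overhead_gb=2, driver_memory_gb=15, shuffle_partitions=500,
)
print(summary)
```

If YARN cannot allocate containers of this size, for example because a per-container memory or vcore limit is lower, executors may never register. That is one possible reason for tasks staying pending, not a diagnosis of this particular job.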

prajwal rao
- 87
- 1
- 9
0
votes
1 answer
Number of Tasks in Spark UI
I am new to Spark. I have a couple of questions regarding the Spark Web UI:
I have seen that Spark can create multiple Jobs for the same
application. On what basis does it create the Jobs?
I understand Spark creates multiple Stages for a single…

Matthew
- 315
- 3
- 5
- 16
0
votes
1 answer
CF template to create sparkUI history server is failing
The default CF template for creating a history server includes the creation of a security group and an IAM role.
I removed both and changed the template to select the existing security group.
Now when I run my CF template, it successfully creates the…

Ankur Rathore
- 133
- 1
- 1
- 5
0
votes
2 answers
What is the difference between duration vs processing time vs batch duration in spark ui?
As shown in the picture below, what's the difference between duration, batch duration, and processing time in the Spark UI?
Thanks
Spark UI Picture

avseq
- 39
- 4
0
votes
0 answers
Spark createOrReplaceTempView cost or performance and other implications
I have a temp view that is created in a loop. This temp view is used in subsequent queries.
for row in manual_est_query_results_list:
    manual_est_query_results.createOrReplaceTempView("manual_estimates")
Sometimes, the size of…

Aravind Yarram
- 78,777
- 46
- 231
- 327
0
votes
0 answers
How to get AWS EMR SPARK UI using boto3
I am trying to use the Spark UI in AWS EMR without logging in to the AWS console. Is there any way to access it from a Python program (boto3)?
I have all credentials and everything apart from console access.
I went through all the materials in Google, but…

karthick karmegam
- 13
- 3
0
votes
1 answer
Optimization Spark job - Spark 2.1
My Spark job currently runs in 59 minutes. I want to optimize it so that it takes less time. I have noticed that the last step of the job takes a lot of time (55 minutes) (see the screenshots of the job in the Spark UI below).
I need to join a big…

Ali
- 43
- 2
- 8
0
votes
0 answers
What are the blank spaces in my Spark UI event timeline?
I have a Spark batch application running on a YARN cluster (in AWS EMR). When I read the input to the application from S3 and write the output also to S3, the application takes a lot of time (nearly 6 minutes). I am guessing that this happens…

Harshit Sharma
- 313
- 4
- 19
0
votes
1 answer
How to catch onchange event of radio button that is present in every row of single table in spark ui toolkit
I am using a Spark UI table, and I have a radio button group (Yes and No) and a textarea in each row. I have multiple rows.
My requirement is that if I click Yes, the textarea should be hidden only in that row. I wrote the code below in load
var table =…

user7350714
- 365
- 1
- 6
- 20
0
votes
0 answers
How can I monitor the tasks started with pyspark
I am using PySpark to run some tasks on a cluster.
I want to see the status of the tasks.
I think that the UI must be started by default,
as mentioned here.
But I am unable to reach the UI (http://localhost:4040 or so).
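For context, the UI is served by the driver process, by default on port 4040. If that port is taken, Spark tries the following ports, up to `spark.port.maxRetries` attempts (16 by default). Below is a small sketch that lists the candidate URLs and probes them. The helper names are hypothetical, and `localhost` only applies when the driver runs on the same machine as the browser.

```python
import http.client
import urllib.error
import urllib.request


def candidate_ui_urls(host="localhost", base_port=4040, max_retries=16):
    """List the URLs where a driver's UI may be listening.

    Covers base_port plus max_retries fallback ports, mirroring the
    defaults of spark.ui.port and spark.port.maxRetries.
    """
    return [f"http://{host}:{port}"
            for port in range(base_port, base_port + max_retries + 1)]


def first_reachable(urls, timeout=1.0):
    """Return the first URL that answers an HTTP request, or None."""
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return url
        except (urllib.error.URLError, OSError, http.client.HTTPException):
            # Nothing usable on this port; try the next candidate.
            continue
    return None


urls = candidate_ui_urls()
print(urls[0], "...", urls[-1])
```

Calling `first_reachable(candidate_ui_urls())` while the application is running returns the first address that responds. On a cluster, the host would be the driver node, or the UI may only be reachable through the resource manager's proxy link.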

Amit Teli
- 875
- 11
- 25