Questions tagged [spark-ui]

the web interface of a running Spark application to monitor and inspect Spark job executions in a web browser

76 questions
0
votes
1 answer

Computing optimal Shuffle Partitions and mitigating Skew in Spark SQL Query

I work with Spark SQL v2.4.7 on EMR (with YARN). I write Spark SQL queries to perform transformations. Estimating the optimal Shuffle Partitions number for a complex query: I am trying to estimate the number of optimal shuffle partitions that needs…
marie20
  • 723
  • 11
  • 30
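
For readers skimming this entry, the partition estimate the question asks about can be sketched as simple arithmetic. This is a rule-of-thumb sketch, not the asker's method or an official Spark formula: the function name, the 128 MiB target partition size, and the rounding up to a multiple of the available cores are all assumptions chosen for illustration. The shuffle size would come from the stage's shuffle write figure in the Spark UI.

```python
import math


def estimate_shuffle_partitions(shuffle_bytes,
                                target_partition_bytes=128 * 1024 * 1024,
                                total_cores=1):
    """Hypothetical helper: suggest a value for spark.sql.shuffle.partitions.

    shuffle_bytes          -- shuffle size observed for the widest stage
    target_partition_bytes -- assumed per-partition target (128 MiB here)
    total_cores            -- executors * cores per executor
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if shuffle_bytes <= 0:
        return total_cores

    # Enough partitions that each one holds roughly the target size.
    by_size = math.ceil(shuffle_bytes / target_partition_bytes)

    # Round up to a whole number of task "waves" so no cores sit idle
    # while a short final wave finishes.
    waves = math.ceil(by_size / total_cores)
    return waves * total_cores


if __name__ == "__main__":
    gib = 1024 ** 3
    # Example with assumed numbers: 100 GiB of shuffle data on 120 cores.
    print(estimate_shuffle_partitions(100 * gib, total_cores=120))
```

The result is only a starting point. Skewed keys can leave a few partitions far larger than the target even when the count looks right, so the task duration and shuffle read size distributions in the stage view are still worth checking.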
0
votes
1 answer

View Spark UI for Jobs executed via Azure ADF

I am not able to view the Spark UI for Databricks jobs executed through a notebook activity in Azure Data Factory. Does anyone know which permissions need to be added to enable this?
mehere
  • 1,487
  • 5
  • 28
  • 50
0
votes
1 answer

trace expensive part of code in spark ui back to part of pyspark

I have some pyspark code with a very large number of joins and aggregations. I've enabled the Spark UI and I've been digging into the event timeline, job stages, and DAG visualization. I can find the task id and executor id for the expensive parts. …
user3476463
  • 3,967
  • 22
  • 57
  • 117
0
votes
0 answers

Why does only the show() operation show up in the spark ui?

I currently have a project using spark. For this project we are calculating some averages on a DataSet as follows: public void calculateAverages() { this.data.show(); String format = "HH"; // Get the dataset such that the time column…
K. Tilman
  • 41
  • 5
0
votes
1 answer

IBM BPM delay on load event

I want to make a button click 10 seconds after the UI loads. I've tried the code below in the "On load" event, but it seems that the timer doesn't work: function myFunction() { me.click(); } setTimeout(myFunction, 10000); Any ideas how to trigger the…
0
votes
1 answer

spark tasks not starting to execute

I am running a job in spark shell job --num-executors 15 --driver-memory 15G --executor-memory 7G --executor-cores 8 --conf spark.yarn.executor.memoryOverhead=2G --conf spark.sql.shuffle.partitions=500 --conf…
prajwal rao
  • 87
  • 1
  • 9
0
votes
1 answer

Number of Tasks in Spark UI

I am new to Spark. I have a couple of questions regarding the Spark Web UI: I have seen that Spark can create multiple Jobs for the same application. On what basis does it create the Jobs? I understand Spark creates multiple Stages for a single…
Matthew
  • 315
  • 3
  • 5
  • 16
0
votes
1 answer

CF template to create sparkUI history server is failing

Default CF Template to create a history server includes creation of security group and IAM role. I removed both and added to select the existing security group. Now when I am running my CF template it is successfully creating the…
0
votes
2 answers

What is the difference between duration vs processing time vs batch duration in spark ui?

As shown in the picture below, what's the difference between duration, batch duration, and processing time in the Spark UI? Thanks. (Spark UI picture)
avseq
  • 39
  • 4
0
votes
0 answers

Spark createOrReplaceTempView cost or performance and other implications

I have the temp view being created in a loop. This temp view is used in subsequent queries. for row in manual_est_query_results_list: manual_est_query_results.createOrReplaceTempView("manual_estimates") Sometimes, the size of…
Aravind Yarram
  • 78,777
  • 46
  • 231
  • 327
0
votes
0 answers

How to get AWS EMR SPARK UI using boto3

I am trying to use the Spark UI in AWS EMR without logging in to the AWS console. Is there any way to access it using a Python program (boto3)? I have all credentials and everything apart from console access. I went through all the materials on Google, but…
0
votes
1 answer

Optimization Spark job - Spark 2.1

My Spark job currently runs in 59 mins. I want to optimize it so that it takes less time. I have noticed that the last step of the job takes a lot of time (55 mins) (see the screenshots of the Spark job in the Spark UI below). I need to join a big…
Ali
  • 43
  • 2
  • 8
0
votes
0 answers

What are the blank spaces in my Spark UI event timeline?

I have a Spark batch application running on a YARN cluster (in AWS EMR). When I read the input to the application from S3 and write the output also to S3, the application takes a lot of time (nearly 6 minutes). I am guessing that this happens…
Harshit Sharma
  • 313
  • 4
  • 19
0
votes
1 answer

How to catch onchange event of radio button that is present in every row of single table in spark ui toolkit

I am using a Spark UI table, and I have a radio button group (Yes and No) and a textarea in each row. I have multiple rows. My requirement is that if I click Yes, the textarea should be hidden only in that row. I wrote the code below in load: var table =…
user7350714
  • 365
  • 1
  • 6
  • 20
0
votes
0 answers

How can I monitor the tasks started with pyspark

I am using pyspark to run some tasks on a cluster. I want to see the status of the tasks. I think that the UI must be started by default as mentioned here. But I am unable to get UI (http://localhost:4040 or so).
Amit Teli
  • 875
  • 11
  • 25
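
As a companion to the last entry, here is a minimal sketch of how one might look for a running application's UI from Python. It relies on Spark's documented defaults: the UI binds to port 4040 (`spark.ui.port`) and, if that port is taken, retries on successive ports up to `spark.port.maxRetries` times (16 by default). The function names and the "any HTTP response counts" check are assumptions made for illustration, not part of Spark.

```python
import urllib.request


def candidate_ui_urls(host="localhost", base_port=4040, max_retries=16):
    """List the URLs where a Spark UI could be listening.

    Mirrors the default behaviour of trying base_port first and then
    the next max_retries ports in order.
    """
    return [f"http://{host}:{base_port + i}" for i in range(max_retries + 1)]


def find_spark_ui(urls, timeout=1.0):
    """Return the first URL that answers an HTTP request, or None.

    Hypothetical check: any successful response is treated as a hit,
    so another web server on the same port would also match.
    """
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return url
        except OSError:
            # Connection refused, timeout, or HTTP error: try the next port.
            continue
    return None


if __name__ == "__main__":
    print(find_spark_ui(candidate_ui_urls()) or "no Spark UI found")
```

The UI only exists while the application's SparkContext is alive, so a finished script leaves nothing on port 4040. From inside a PySpark session, `spark.sparkContext.uiWebUrl` reports the address directly; on a cluster the driver may run on a different host than the one the browser is on, in which case `localhost` will not resolve to it.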