Questions tagged [spark-ui]

The web interface of a running Spark application, used to monitor and inspect Spark job executions in a web browser.

76 questions
3 votes · 2 answers

Understanding Event Timeline of Spark UI

I have a job running that shows the Event Timeline as follows. I am trying to understand the gaps between these single lines; they seem to be parallel but not immediately sequential with other stages... Any other insight from this, and what is the cluster…
Aakash Basu · 1,689
2 votes · 1 answer

How to find out when Spark Application has any memory/disk spills without checking Spark UI

My environment: Databricks 10.4, PySpark. I'm looking into Spark performance, specifically the memory/disk spills that are shown in the Spark UI Stages section. What I want to achieve is to get notified if my job had spills. I have…
BI Dude · 1,842
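Short of opening the UI, the same stage-level counters are exposed by Spark's monitoring REST API: `/api/v1/applications/<app-id>/stages` returns each stage's `memoryBytesSpilled` and `diskBytesSpilled`. A minimal sketch of a spill check (the base URL and the alerting hook around it are assumptions; adapt to your cluster):

```python
import json
from urllib.request import urlopen

def stages_with_spills(stages):
    """Return the stages that spilled to memory or disk.

    `stages` is the JSON list served by the Spark REST endpoint
    /api/v1/applications/<app-id>/stages; each entry carries the
    memoryBytesSpilled / diskBytesSpilled counters shown in the UI.
    """
    return [
        s for s in stages
        if s.get("memoryBytesSpilled", 0) > 0 or s.get("diskBytesSpilled", 0) > 0
    ]

def check_app_for_spills(ui_base_url, app_id):
    # ui_base_url is e.g. "http://driver-host:4040" (assumption: the
    # driver UI is reachable from wherever this monitor runs).
    with urlopen(f"{ui_base_url}/api/v1/applications/{app_id}/stages") as resp:
        stages = json.load(resp)
    return stages_with_spills(stages)

if __name__ == "__main__":
    # Offline demo on a hand-written sample payload:
    sample = [
        {"stageId": 1, "memoryBytesSpilled": 0, "diskBytesSpilled": 0},
        {"stageId": 2, "memoryBytesSpilled": 1048576, "diskBytesSpilled": 524288},
    ]
    print([s["stageId"] for s in stages_with_spills(sample)])  # [2]
```

On Databricks the driver UI is not always reachable from outside the workspace, so the usual pattern is to run a check like this from a scheduled notebook or job in the same workspace.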
2 votes · 1 answer

Pyspark monitoring metrics not making sense

I am trying to understand the Spark UI and HDFS UI while using PySpark. Following are the properties for the session I am running: pyspark --master yarn --num-executors 4 --executor-memory 6G --executor-cores 3 --conf…
figs_and_nuts · 4,870
2 votes · 2 answers

How to increase Jetty's header buffer size in the Spark UI reverse proxy

I'm getting "HTTP ERROR 502 Bad Gateway" when I click on a worker link in my standalone Spark UI. Looking at the master logs I can see a corresponding message... HttpSenderOverHTTP.java:219 Generated headers (4096 bytes), chunk (-1 bytes), content…
Martin Stone · 12,682
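Spark exposes the Jetty request-header limit through the `spark.ui.requestHeaderSize` setting (default 8k, available since Spark 2.2.3/2.3.2/2.4.0), which is the usual knob when the reverse proxy generates headers larger than the buffer. A sketch of a `spark-defaults.conf` fragment for the master and workers (the 16k value is an assumption; size it to the headers in your logs):

```
spark.ui.requestHeaderSize  16k
```

For a standalone master or worker started outside spark-submit, the same property can be passed via `SPARK_MASTER_OPTS` / `SPARK_WORKER_OPTS` as `-Dspark.ui.requestHeaderSize=16k`.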
2 votes · 1 answer

Why do I see two jobs in Spark UI for a single read?

I am trying to run the below script to load a file with 24k records. Is there any reason why I am seeing two jobs for a single load in the Spark UI? Code: from pyspark.sql import SparkSession spark = SparkSession\ .builder\ .appName("DM")\ …
user16344431
2 votes · 1 answer

How can I get DAG of Spark Sql Query execution plan?

I am doing some analysis on Spark SQL query execution plans. The execution plans that the explain() API prints are not very readable. In the Spark web UI, a DAG graph is created which is divided into jobs, stages, and tasks and is much more readable. Is…
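The data behind the SQL tab's DAG is also available programmatically: Spark 3.1+ serves it from the REST endpoint `/api/v1/applications/<app-id>/sql?details=true`, where each execution carries `nodes` and `edges` describing the plan graph. A sketch that turns one execution record into readable edges (the field names follow that REST shape; the sample record is hand-written):

```python
def dag_edges(execution):
    """Render one /sql execution (details=true) as 'from -> to' edge strings.

    Assumes the Spark 3.x REST shape: `nodes` entries with nodeId/nodeName
    and `edges` entries with fromId/toId.
    """
    names = {n["nodeId"]: n["nodeName"] for n in execution.get("nodes", [])}
    return [
        f'{names[e["fromId"]]} -> {names[e["toId"]]}'
        for e in execution.get("edges", [])
    ]

# Offline demo with a hand-written execution record:
sample = {
    "nodes": [
        {"nodeId": 0, "nodeName": "Scan csv"},
        {"nodeId": 1, "nodeName": "Filter"},
        {"nodeId": 2, "nodeName": "HashAggregate"},
    ],
    "edges": [{"fromId": 0, "toId": 1}, {"fromId": 1, "toId": 2}],
}
print(dag_edges(sample))  # ['Scan csv -> Filter', 'Filter -> HashAggregate']
```

The same JSON could be fed into a graph library or Graphviz if an actual picture, rather than the UI's rendering, is what is needed.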
2 votes · 0 answers

Why is executor memory used shown as greater than total available memory in the Spark web UI?

I have a Spark Structured Streaming job that has been running for around the last 3 weeks. When I open the Executors tab in the Spark web UI, it shows memory used: 36.1GB, total available memory for storage: 3.2GB. For this application, executor memory is set…
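Part of the confusion here is that the "storage" total in the Executors tab is not the executor heap: it is the unified memory region, roughly `(heap - 300 MB) * spark.memory.fraction` (default 0.6), so it is always much smaller than the configured executor memory. A sketch of that arithmetic under an assumed 6 GiB heap (the heap size is an assumption for illustration, not taken from the question):

```python
RESERVED_MB = 300          # fixed reserved memory in the unified memory manager
MEMORY_FRACTION = 0.6      # default spark.memory.fraction

def storage_pool_mb(executor_heap_mb,
                    reserved_mb=RESERVED_MB,
                    memory_fraction=MEMORY_FRACTION):
    """Approximate the 'Storage Memory' total shown in the Executors tab."""
    return (executor_heap_mb - reserved_mb) * memory_fraction

# With an assumed 6 GiB executor heap:
print(round(storage_pool_mb(6 * 1024)))  # 3506 MB, i.e. ~3.4 GiB
```

A cumulative "memory used" figure far above that pool on a weeks-old streaming job is therefore not by itself evidence of a leak; the two numbers measure different things.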
2 votes · 2 answers

Refusing to display LOCALHOST in a frame because 'X-Frame-Options' set to 'sameorigin'

This question specifically regards localhost. I am trying to embed a localhost web page in another localhost web page; however, it states that this cannot be done. This was the message in Chrome developer tools: Refused to display…
2 votes · 1 answer

Spark UI -> SQL tab doesn't show all (older) stages

I am executing a Spark (SQL) job which has many stages (~150). It is written primarily in Spark SQL within an internal framework that chains the SQLs using temporary views and DataFrames. For the initial intermediate table writes, I can see a…
sujit · 2,258
2 votes · 2 answers

What is 'Active Jobs' in Spark History Server Spark UI Jobs section

I'm trying to understand the Spark History Server components. I know that the History Server shows completed Spark applications. Nonetheless, I see 'Active Jobs' set to 1 for a completed Spark application. I'm trying to understand what 'Active Jobs'…
Ash · 1,180
2 votes · 1 answer

Spark local mode: How to query the number of executor slots?

I'm following the tutorial Using Apache Spark 2.0 to Analyze the City of San Francisco's Open Data, where it's claimed that the "local mode" Spark cluster available in Databricks "Community Edition" provides you with 3 executor slots. (So 3 tasks should…
das-g · 9,718
1 vote · 1 answer

PySpark: get the max value of a CSV column the quickest way possible

I am trying to get the max value of a column using this: df.agg(max(col('some_integer_column')), min(col('some_integer_column'))). The df comes from a CSV file, which I know would be much easier and faster if it were Parquet/Delta. As the CSV file needs…
1 vote · 0 answers

WholeStageCodegen’s min duration larger than the query duration

I found in the Spark UI that the min duration of the WholeStageCodegen part is larger than the duration of the query. I think that does not make sense, right? Now I want to examine where those total, min, max values are calculated in the…
1 vote · 0 answers

Spark on YARN: error while closing the SparkContext

My Spark application runs in a YARN Hadoop cluster. After completing its tasks and attempting to close the SparkContext, my application encounters an error: 2023-06-05 12:30:43,361 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode:…
1 vote · 0 answers

I have code running in a GCP cluster and I am trying to connect to the Spark UI, but it says it cannot connect to port 8080

bind [::1]:8080: Cannot assign requested address Linux data-eng-m 5.10.0-0.deb10.16-amd64 #1 SMP Debian 5.10.127-2~bpo10+1 (2022-07-28) x86_64 This is the error that I keep getting. I created my application in a notebook running on the cluster, but…
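A bind failure like this usually means the process is trying to claim a loopback address/port it cannot get. On a managed GCP (Dataproc) cluster Spark typically runs on YARN, so there is no standalone master UI on 8080; the driver UI defaults to 4040 and can be moved when that port is contested via `spark.ui.port`. A hedged config fragment (the port number is illustrative; pick any free port):

```
# assumption: 4050 is an arbitrary free port on the driver node
spark.ui.port  4050
```

The UI then has to be reached on the driver node itself, e.g. through an SSH tunnel to the cluster, rather than on the notebook host.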