Questions tagged [spark-jobserver]

spark-jobserver provides a RESTful interface for submitting and managing Apache Spark jobs, jars, and job contexts.

Reference: https://github.com/spark-jobserver/spark-jobserver

Real-world example: https://nishutayaltech.blogspot.com/2016/05/how-to-run-spark-job-server-and-spark.html
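The RESTful interface above is typically driven with curl. A minimal sketch of the upload-then-submit flow, assuming a jobserver at localhost:8090; the app name, jar file, and job class below are hypothetical placeholders:

```shell
# All names here are hypothetical; a running jobserver is assumed at localhost:8090.
JOBSERVER="http://localhost:8090"
APP="wordcount"                    # app name the jar is uploaded under
CLASS="com.example.WordCount"      # hypothetical job class inside the jar

SUBMIT_URL="$JOBSERVER/jobs?appName=$APP&classPath=$CLASS"

# Against a live server you would run:
#   curl --data-binary @job.jar "$JOBSERVER/jars/$APP"   # upload the assembly jar
#   curl -d "" "$SUBMIT_URL"                             # start a job from it
echo "$SUBMIT_URL"
```

The submit call returns a job ID that can then be polled via `GET /jobs/<id>`.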

165 questions
3 votes, 0 answers

Spark Jobserver - stress test - Async POST error response: akka.pattern.AskTimeoutException

Hi, I am trying to do a stress test on the Spark Job Server, and I am sharing the Spark context among the submitted jobs with the following properties: spark.executor.cores='2' spark.cores.max='1' spark.driver.cores='1' spark.driver.memory='1g'…
karthi
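Context sharing of the kind this question describes is usually done by pre-creating a named context and routing jobs to it. A minimal sketch, assuming the query-parameter names documented in the jobserver README (context and app names here are hypothetical):

```shell
# Hypothetical names; a running jobserver is assumed at localhost:8090.
JOBSERVER="http://localhost:8090"
CTX="shared-context"

# Pre-create a long-lived context with explicit resources, then route every
# job to it with the context= parameter instead of starting a fresh context:
CREATE_URL="$JOBSERVER/contexts/$CTX?num-cpu-cores=2&memory-per-node=1g"
SUBMIT_URL="$JOBSERVER/jobs?appName=myapp&classPath=com.example.MyJob&context=$CTX"

# Against a live server:
#   curl -d "" "$CREATE_URL"     # run once
#   curl -d "" "$SUBMIT_URL"     # every submission reuses the same context
echo "$CREATE_URL"
```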
3 votes, 0 answers

spark jobserver job crash

I'm running a job using spark jobserver (it takes ±10 min). The job randomly crashes during its execution (about one time in two) with the following exception on the executor: ERROR 2016-10-13 19:22:58,617 Logging.scala:95 -…
Quentin
3 votes, 2 answers

Writing Parquet file in standalone mode works… multiple worker mode fails

In Spark version 1.6.1 (code in Scala 2.10), I am trying to write a DataFrame to a Parquet file: import sc.implicits._ val triples = file.map(p => _parse(p, " ", true)).toDF()…
brecht-d-m
3 votes, 2 answers

Submitting Spark Jobs to Spark Cluster

I am a complete novice in Spark and have just started exploring it. I have chosen the longer path: instead of installing Hadoop through a CDH distribution, I installed Hadoop from the Apache website and set up the config files myself to…
3 votes, 0 answers

How to process REPL generated class files in Spark by using parallel running Scala interpreters?

In my company we are currently using the Spark interpreter to dynamically generate class files with spark-jobserver. Those class files are generated on our Spark cluster's driver and saved into the directory (on that driver) defined by using…
3 votes, 2 answers

Building spark-jobserver Using SBT and Scala

Can anyone suggest better documentation for spark-jobserver? I have gone through the spark-jobserver URL but was unable to follow it. It would be great if someone could explain step by step how to use spark-jobserver. Tools used…
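The step-by-step flow this question asks for roughly follows the repository's own deploy scripts: copy an environment template, fill it in, then build and deploy. A hedged sketch; every value below is a placeholder, and the variable names should be checked against `config/local.sh.template` in your jobserver checkout:

```shell
# Hedged sketch of the README's deploy flow; all values are placeholders.
cat > myenv.sh <<'EOF'
DEPLOY_HOSTS="jobserver-host"      # where server_deploy.sh copies the build
APP_USER=spark
APP_GROUP=spark
INSTALL_DIR=/opt/spark-jobserver
SPARK_HOME=/opt/spark
SPARK_VERSION=1.6.1
EOF
# Then, from a clone of the repository:
#   sbt job-server/assembly          # or: sbt job-server/reStart for a local dev server
#   bin/server_deploy.sh myenv       # builds and copies to DEPLOY_HOSTS
#   bin/server_start.sh              # run on the deploy host
echo "wrote myenv.sh"
```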
2 votes, 2 answers

getting timeout when submitting fat jar to spark-jobserver (akka.pattern.AskTimeoutException)

I built my job jar using sbt assembly so that all dependencies are in one jar. When I try to submit my binary to spark-jobserver, I get akka.pattern.AskTimeoutException. I modified my configuration to be able to submit large jars (I added…
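Timeouts on large uploads are commonly addressed by raising the HTTP layer's size and timeout limits in the server config. A hedged sketch; the key names below come from the spray-based jobserver builds and should be verified against your jobserver version:

```shell
# Hedged sketch: key names are from spray-based jobserver builds; verify them
# against your version before relying on this.
cat > jobserver-overrides.conf <<'EOF'
spray.can.server {
  request-timeout = 60s
  idle-timeout = 120s
  parsing.max-content-length = 200m
}
EOF
# Merge the overrides into the .conf your environment's server_start.sh points at.
echo "wrote jobserver-overrides.conf"
```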
2 votes, 1 answer

spark jobserver failing to build with Spark 2.0

I am trying to run spark-jobserver with Spark 2.0. I cloned the spark-2.0-preview branch from the GitHub repository and followed the deployment guide, but when I try to deploy the server using bin/server_deploy.sh I get a compilation error: Error: [error]…
2 votes, 0 answers

Spark Jobserver: Very large task size

I'm getting messages along the lines of the following in my Spark JobServer logs: Stage 14 contains a task of very large size (9523 KB). The maximum recommended task size is 100 KB. I'm creating my RDD with this code: List data = new…
yarrichar
2 votes, 0 answers

Apache Spark with Spark JobServer crash after some hours

I'm using Apache Spark 2.0.2 together with Spark JobServer 0.7.0. I know this is not a best practice, but this is a first step. My server has 52 GB RAM and 6 CPU cores, CentOS 7 x64, Java(TM) SE Runtime Environment (build 1.7.0_79-b15), and it has…
2 votes, 1 answer

Spark JobServer, memory settings for release

I've set up a spark-jobserver to enable complex queries on a reduced dataset. The jobserver executes two operations: it syncs with the main remote database, makes a dump of some of the server's tables, reduces and aggregates the data, and saves the result…
Marco Fedele
2 votes, 1 answer

How do I configure the FAIR scheduler with Spark-Jobserver?

When I post simultaneous jobserver requests, they always seem to be processed in FIFO mode. This is despite my best efforts to enable the FAIR scheduler. How can I ensure that my requests are always processed in parallel? Background: On my cluster…
Graham S
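FAIR scheduling is a Spark-level setting, so with the jobserver it has to reach the context's SparkConf, and requests must run in one shared context to compete at all. A hedged sketch: the pool file below is standard Spark; passing `spark.*` settings as context query parameters is assumed from jobserver's config pass-through, and the context name is hypothetical:

```shell
# Standard Spark FAIR pool file (pool layout is an example, not prescriptive).
cat > fairscheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
EOF
# Hypothetical context creation carrying the scheduler mode into SparkConf:
CREATE_URL="http://localhost:8090/contexts/fair-ctx?spark.scheduler.mode=FAIR"
#   curl -d "" "$CREATE_URL"   # against a live server; also point
#                              # spark.scheduler.allocation.file at the XML above
echo "$CREATE_URL"
```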
2 votes, 0 answers

Parallel execution of spark job using job-server

I'm using a Spark cluster in standalone mode plus spark-jobserver to execute my jobs written in Scala. I launched the job-server in a Docker container: docker run -d -p 8090:8090 -e SPARK_MASTER=spark://spark-server:7077…
Cortwave
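For overlapping execution in this setup, two things have to line up: the jobs must share a context with enough cores, and they must be submitted asynchronously (jobserver runs a POSTed job asynchronously unless `sync=true` is passed). A hedged sketch; the image, app, and context names are assumptions:

```shell
# Hedged sketch; image and names below are assumptions, server at localhost:8090.
#   docker run -d -p 8090:8090 -e SPARK_MASTER=spark://spark-server:7077 \
#     sparkjobserver/spark-jobserver
JOBSERVER="http://localhost:8090"
SUBMIT_URL="$JOBSERVER/jobs?appName=myapp&classPath=com.example.MyJob&context=shared-ctx"
# Fire three overlapping async submissions against a live server:
#   for i in 1 2 3; do curl -d "" "$SUBMIT_URL" & done; wait
echo "$SUBMIT_URL"
```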
2 votes, 1 answer

Schedule Automatic Spark Jobs in Spark Job server every hour

In DataStax Enterprise Edition 4.8, Spark Jobserver 0.5.2 has been specially compiled against the supported version of Apache Spark, 1.4.1.1. The Spark job will read data from Cassandra and write summarized data into another table in the same keyspace. Is…
Sid
2 votes, 0 answers

Increase the Query Parallelism Capacity on Cached RDD (DataFrame) with Spark-Job-Server on a Standalone Spark Cluster

First of all, our standalone Spark cluster consists of 20 nodes; each of them has 40 cores and 128 GB of memory (including the 2 masters). 1. We use Spark-Job-Server to reuse the Spark context (at its core, we want to reuse cached RDDs for querying),…
Tao Huang