Questions tagged [spark-jobserver]

spark-jobserver provides a RESTful interface for submitting and managing Apache Spark jobs, jars, and job contexts.

Reference: https://github.com/spark-jobserver/spark-jobserver

Real-world example: https://nishutayaltech.blogspot.com/2016/05/how-to-run-spark-job-server-and-spark.html

165 questions
0 votes, 1 answer

Reusable SparkContext instance

I'm quite new to Big Data and currently I'm working on a CLI project that performs some text parsing using Apache Spark. When a command is typed, a new SparkContext is instantiated and some files are read from an HDFS instance. However, the spark is…
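
For context, a minimal sketch of the kind of job that reuses a long-lived SparkContext managed by spark-jobserver (legacy SparkJob API; the object name and config key are illustrative, not from the question):

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

    object ParseJob extends SparkJob {
      // Runs before runJob; a real job would check its input here.
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

      // The SparkContext belongs to the jobserver context, so each CLI command
      // can POST a new job without paying the context start-up cost again.
      override def runJob(sc: SparkContext, config: Config): Any = {
        val path = config.getString("input.path")
        sc.textFile(path).flatMap(_.split("\\s+")).countByValue()
      }
    }

A persistent context can then be created once (e.g. curl -d "" 'http://localhost:8090/contexts/parse-context') and reused for every subsequent job submission.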
0 votes, 1 answer

Spark job server does not return JSON in the correct format

case class Response(jobCompleted:String,detailedMessage:String) override def runJob(sc: HiveContext, runtime: JobEnvironment, data: JobData): JobOutput = { val generateResponse= new GenerateResponse(data,sc) val…
Siva • 21 • 6
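
A frequently suggested workaround, sketched here as a fragment that assumes the rest of the job stays as in the excerpt, is to return plain Maps or Seqs rather than a bare case class, since the default result serializer renders those as JSON:

    // Fragment only: imports and the enclosing job class follow the question's own setup.
    override def runJob(sc: HiveContext, runtime: JobEnvironment, data: JobData): JobOutput = {
      val generateResponse = new GenerateResponse(data, sc)   // from the question
      // Keys and values here are illustrative placeholders.
      Map("jobCompleted" -> "true", "detailedMessage" -> generateResponse.toString)
    }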
0 votes, 1 answer

Spark Jobserver high availability

I have a standalone Spark cluster with a few nodes. I was able to make it highly available with ZooKeeper. I'm using Spark Jobserver spark-2.0-preview and I have configured the jobserver env1.conf file with the available Spark URLs like…
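
One relevant detail, shown as a sketch (hostnames and ports are placeholders): Spark's ZooKeeper-based standalone HA accepts a comma-separated list of masters, so the jobserver configuration can list them all:

    # env1.conf-style snippet (HOCON); hosts and ports are placeholders
    spark {
      # list every master so the context can fail over when the active master changes
      master = "spark://master1.example.internal:7077,master2.example.internal:7077"
    }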
0 votes, 1 answer

Cast from case class Named RDD

I was sharing my RDDs between jobs with the CassandraRow type, but I'm now joining several RDDs together, so a case class makes more sense. I save my RDD as below and then retrieve it in a new job. This worked fine with type CassandraRow. CData is the…
ozzieisaacs • 833 • 2 • 11 • 23
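
A minimal sketch of sharing a typed RDD between jobs with NamedRddSupport, assuming the legacy SparkJob API; CData's fields are invented for illustration, and both jobs need to be built into the same jar so they load the same CData class:

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

    case class CData(aid: String, sessions: Long)   // hypothetical fields

    object SaveCData extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any = {
        val rdd = sc.parallelize(Seq(CData("55-BHA", 58), CData("07-YET", 18)))
        this.namedRdds.update("cdata", rdd)          // publish under a shared name
        "saved"
      }
    }

    object ReadCData extends SparkJob with NamedRddSupport {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid
      override def runJob(sc: SparkContext, config: Config): Any = {
        // get[T] only succeeds if this jar sees the exact same CData class
        this.namedRdds.get[CData]("cdata").map(_.count()).getOrElse(0L)
      }
    }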
0 votes, 1 answer

spark-jobserver: Worker does not connect back to the driver

I set up a small Spark environment on two machines. One runs a master and a worker, and the other one runs a worker only. I can use this cluster from the Spark shell like: spark-shell --master spark://mymaster.example.internal:7077 I can run…
rabejens • 7,594 • 11 • 56 • 104
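
Not an authoritative fix, but the properties usually involved when workers cannot call back to the driver are the driver's advertised address and its callback ports; a sketch with placeholder values:

    # Spark properties for the jobserver-driven context (values are placeholders)
    spark.driver.host=jobserver.example.internal   # an address the workers can resolve
    spark.driver.port=40000                        # pinned so the firewall can allow it
    spark.blockManager.port=40001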
0 votes, 1 answer

Unable to start the deployed Spark job server: error org.slf4j.LoggerFactory not found

I am trying to use the Spark job server on a CDH 5.11 cluster with Spark version 1.6.0. When I try to start the Spark jobserver on the deployed machine I get this error log: [ERROR] [06/02/2017 15:30:14.966] [JobServer-akka.actor.default-dispatcher-3]…
0 votes, 0 answers

Spark Jobserver fails just by receiving a job request

Jobserver 0.7.0 has 4 GB of RAM available and 10 GB for the context; the system has 3 more GB free. The context had been running for a while and, at the moment it receives a request, it fails without any error. The request is the same as other ones that have…
0 votes, 1 answer

Spark Job Server multithreading and dynamic allocation

I had pretty big expectations of Spark Job Server, but found that it critically lacks documentation. Could you please answer one or all of the following questions: Does Spark Job Server submit jobs through a Spark session? Is it possible to run a few jobs in…
VB_ • 45,112 • 42 • 145 • 293
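
For orientation only, two of the settings these questions usually come down to, sketched with illustrative values:

    # spark-jobserver side: how many jobs may run concurrently in one context
    spark.jobserver.max-jobs-per-context = 8

    # Spark side: dynamic allocation also needs the external shuffle service
    spark.dynamicAllocation.enabled = true
    spark.shuffle.service.enabled = true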
0 votes, 1 answer

Split RDD into many RDDs and Cache

I have an RDD like so: (aid, session, sessionnew, date) (55-BHA, 58, 15, 2017-05-09) (07-YET, 18, 5, 2017-05-09) (32-KXD, 27, 20, 2017-05-09) (19-OJD, 10, 1, 2017-05-09) (55-BHA, 1, 0, 2017-05-09) (55-BHA, 19, 3, 2017-05-09) (32-KXD, 787, 345,…
ozzieisaacs • 833 • 2 • 11 • 23
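
One common pattern, sketched with the sample tuples from the question (not necessarily what the asker settled on), is to cache the source RDD once and derive the per-aid RDDs by filtering:

    import org.apache.spark.{SparkConf, SparkContext}

    object SplitByAid {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("split-by-aid").setMaster("local[*]"))
        val rows = sc.parallelize(Seq(
          ("55-BHA", 58, 15, "2017-05-09"),
          ("07-YET", 18, 5, "2017-05-09"),
          ("32-KXD", 27, 20, "2017-05-09")))
        rows.cache()                                           // cache once, filter many times
        val aids = rows.map(_._1).distinct().collect()
        val perAid = aids.map(a => a -> rows.filter(_._1 == a)).toMap
        perAid.foreach { case (a, rdd) => println(s"$a -> ${rdd.count()} rows") }
        sc.stop()
      }
    }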
0 votes, 0 answers

ERROR .jobserver.JobManagerActor [] [] - About to restart actor due to exception: java.lang.NullPointerException

I got this error when I follow this link: https://support.instaclustr.com/hc/en-us/articles/214473617-Getting-started-with-Spark-Jobserver-and-Instaclustr. I am trying to write a service that can connect to Cassandra and Redis, but with the first…
Wilson Ho • 372 • 5 • 18
0 votes, 0 answers

Spark Jobserver too slow compared to Apache Spark queries

Using Jobserver spark-2.0-preview and Apache Spark 2.1.1. The execution time per query is no greater than 1 s when I check the Spark UI, but I receive the response from the jobserver after 10 seconds and often much more. I'm querying Parquet files and I'm…
0 votes, 0 answers

How to keep log files for each job with a created Spark jobserver context

I used to run the Spark job server from server_start.sh. It comes with log files for the default assigned context specified in log4j. However, when I ran the following command to create a context: curl -d ""…
Tom • 83 • 8
0 votes, 1 answer

Did the Spark job server start properly, and out-of-memory response

I am using spark-jobserver-0.6.2-spark-1.6.1. (1) export OBSERVER_CONFIG = /custom-spark-jobserver-config.yml (2) ./server_start.sh Execution of the above start shell script returns without error. However, it created a pid file: spark-jobserver.pid. When…
Tom • 83 • 8
0 votes, 0 answers

Implementing Checkpointing in Spark Streaming Job submitted using Spark Job Server

Implementing checkpointing when a Spark Streaming job is directly submitted to Spark seems straightforward. We are facing quite some complexities when we need to do the same when the streaming job is submitted using Spark Job Server. Any…
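
For comparison, the plain Spark Streaming checkpoint pattern that a jobserver-submitted job would also need to reproduce: build the StreamingContext in a factory handed to StreamingContext.getOrCreate, so a restart can recover from the checkpoint directory (path, source, and batch interval are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CheckpointedStream {
      val checkpointDir = "/tmp/spark-checkpoint-demo"         // placeholder path

      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("checkpointed-stream").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)                          // enable checkpointing
        ssc.socketTextStream("localhost", 9999).count().print()
        ssc
      }

      def main(args: Array[String]): Unit = {
        // Recover from the checkpoint if present, otherwise build a fresh context.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }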
0 votes, 1 answer

Job submission fails with a generic 503 error when the max-jobs-per-context value is exceeded

I have set context-per-jvm = true and max-jobs-per-context = 4 in my config file. I have pre-created a context using the following curl command: curl -d "" 'http://jobserver:8090/contexts/test-context' I am using the following curl command to submit a job:…
vatsal mevada • 5,148 • 7 • 39 • 68