How would you determine a safe maximum value for the max-jobs-per-context setting, which controls the number of Spark jobs that may run concurrently on a context? What would happen if you set it too high? The default is 8 (see link…
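For reference, the setting lives under the spark.jobserver section of the job server configuration; a minimal fragment (the value shown is the default mentioned above — treat the comment as an assumption, since a safe ceiling depends on how many task slots the context's executors actually expose):

```hocon
# application.conf fragment for spark-jobserver
spark {
  jobserver {
    # Concurrent Spark jobs allowed per context. Going far beyond the
    # context's available cores mostly queues jobs and grows driver-side
    # bookkeeping rather than adding throughput.
    max-jobs-per-context = 8
  }
}
```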
I'm trying to deploy an instance of spark-jobserver in a Docker container and connect it to the BlueMix Spark service. Locally, the container starts perfectly with the command docker run -d -p 8090:8090 {image-name}, but it looks like the BlueMix ice -p command works…
I am trying to start the spark-job-server on my Linux machine. I did the following:
Installed the Cloudera distribution CDH (5.x) and got it up and running
Downloaded spark-job-server from the above-mentioned GitHub repository
Extracted the project into some…
I have a simple piece of Spark code in which I read a file using SparkContext.textFile() and then perform some operations on that data, and I am using spark-jobserver to get the output.
In the code I cache the data, but after the job ends and I execute that…
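The question is truncated, but the usual reason a plain .cache() does not survive between jobserver jobs is that the cached RDD is only reachable while the job that created it holds a reference. The jobserver pattern for sharing cached data across jobs in a persistent context is the NamedRddSupport mixin. A minimal sketch, assuming the classic SparkJob API; the input path and RDD name are placeholders, not from the question:

```scala
import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{NamedRddSupport, SparkJob, SparkJobValid, SparkJobValidation}

// Sketch only: "data/input.txt" and "shared-lines" are hypothetical names.
object CachedCountJob extends SparkJob with NamedRddSupport {

  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    // getOrElseCreate looks up the RDD registered under this name in the
    // context; only the first job to run actually reads and caches the file.
    val lines = namedRdds.getOrElseCreate("shared-lines", {
      sc.textFile("data/input.txt").cache()
    })
    lines.count()
  }
}
```

Note that this only helps when the jobs run in the same long-lived (persistent) context; a per-job context is torn down at the end of the job, taking its cache with it.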
I am doing a stress test on my Spark application, which uses the spark-cassandra-connector as well as the Cassandra driver.
In my application, I use the Cassandra driver to select the most recent value from the C* table.
This works fine as long as the…
I am new to the Spark world and Job Server.
My code:
package spark.jobserver
import java.nio.ByteBuffer
import scala.collection.JavaConversions._
import scala.collection.mutable.ListBuffer
import scala.collection.immutable.Map
import…
I have been trying out Spark using spark-shell. All my data is in SQL.
I used to include external jars using the --jars flag, e.g. /bin/spark-shell --jars /path/to/mysql-connector-java-5.1.23-bin.jar --master spark://sparkmaster.com:7077
I have…
I have a large amount of data (images) that a machine-learning model (a CNN) processes to produce results. As part of Spark job performance analysis, I'm trying to see the internal Spark (YARN) job flow. The Spark UI shows the list of Jobs, Stages - DAG, Executors and worker…
I have a Spark application that contains multiple Spark jobs to be run on Azure Databricks. I want to build and package the application into a fat JAR. The application compiles successfully. While I am trying to package it (command: sbt…
I am using Spark Job Server to submit Spark jobs in a cluster. The application I am trying to test is a Spark program based on Sansa query and the Sansa stack. Sansa is used for scalable processing of huge amounts of RDF data, and Sansa query is one of…
For the sake of low-latency Spark jobs, Spark Job Server provides a Persistent Context option. But I'm not sure: does a persistent context retain the metadata, block locations, and any other information required for query planning? By default Spark should read…
It looks like spark.jobserver.context.SQLContextFactory is deprecated. Could somebody help me with an example of how to run Spark SQL with the latest (0.8) version of Spark Job Server?
Thank you.
I have installed spark-jobserver v0.8.0 with Scala 2.11. I am able to run the examples from job-server-tests. However, I am unable to find any examples of running SQL (load a file / create a temporary table and then run SQL against this table). Do…
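For both of the 0.8-era SQL questions above, the replacement for the deprecated SQLContextFactory route is the typed job API from job-server-extras, where a job receives a SparkSession directly and can load a file, register a temp view, and run SQL against it. A sketch, assuming the 0.8 typed API (Or/Good/Bad/Every come from scalactic, which jobserver's validation uses); the file path, view name, and config key are placeholders:

```scala
import com.typesafe.config.Config
import org.apache.spark.sql.SparkSession
import org.scalactic._
import spark.jobserver.SparkSessionJob
import spark.jobserver.api.{JobEnvironment, SingleProblem, ValidationProblem}

import scala.util.Try

object SqlOverFileJob extends SparkSessionJob {
  type JobData = String          // the SQL statement to run
  type JobOutput = Array[String]

  // Pull the SQL text out of the job's config; reject the job if missing.
  def validate(spark: SparkSession, runtime: JobEnvironment, config: Config):
      JobData Or Every[ValidationProblem] =
    Try(config.getString("sql"))
      .map(Good(_))
      .getOrElse(Bad(One(SingleProblem("config entry 'sql' is missing"))))

  def runJob(spark: SparkSession, runtime: JobEnvironment, sql: JobData): JobOutput = {
    // "people.json" and "people" are hypothetical names for this sketch.
    spark.read.json("people.json").createOrReplaceTempView("people")
    spark.sql(sql).collect().map(_.toString)
  }
}
```

Such a job has to run in a session-capable context; in 0.8 that means creating the context with the extras factory (something like context-factory=spark.jobserver.context.SessionContextFactory — check the exact class name against your jobserver build).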
Is it possible to run spark-jobserver on Windows without using an emulator like Cygwin? I have tried Git Bash as well, as I thought it supports .sh files, but I didn't have any luck.
Note: I have tried building the source code of spark-jobserver…
Getting java.lang.OutOfMemoryError: Java heap space in the Spark Job Server logs, and the job server goes down:
[2017-06-01 19:09:26,708] ERROR akka.actor.ActorSystemImpl [] [ActorSystem(JobServer)] - Uncaught error from thread…
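The stack trace above is from the job server's own JVM (the Akka actor system), not from the Spark executors, so a first step is usually raising the server's own heap. In the stock launch scripts this is controlled by an environment variable in the settings file; the variable name below comes from the shipped settings.sh template, so verify it against your deployment before relying on it:

```shell
# local.sh / settings.sh fragment for spark-jobserver
# Heap for the job server JVM itself, not for driver or executors
# (the shipped default is 1G).
JOBSERVER_MEMORY=4G
```

If the OOM persists after raising the heap, the usual suspects are large job results being collected back through the server or too many concurrent contexts, both of which accumulate in this same JVM.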