I was trying to deploy spark-jobserver on an EMR cluster, per this documentation: https://github.com/spark-jobserver/spark-jobserver/blob/master/doc/EMR.md#configure-master-box
I was able to install the job-server on EMR, but while starting the…
I have a Spark job that runs every day as part of a pipeline and performs simple batch processing: say, adding a column to a DataFrame with another column's value squared (old DF: x; new DF: x, x^2).
I also have a front-end app that consumes these 2…
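For context, the batch step described above is a one-liner in Spark, roughly `df.withColumn("x_squared", col("x") * col("x"))`. Since that requires a running SparkSession, here is a self-contained plain-Scala sketch of the same row-wise transformation (the values are illustrative):

```scala
object Main {
  def main(args: Array[String]): Unit = {
    // In Spark the step would be roughly:
    //   df.withColumn("x_squared", col("x") * col("x"))
    // Plain-collections sketch of that transformation:
    val oldDf = Seq(1.0, 2.0, 3.0)           // old DF: x
    val newDf = oldDf.map(x => (x, x * x))   // new DF: (x, x^2)
    newDf.foreach { case (x, x2) => println(s"$x -> $x2") }
  }
}
```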
How can I change the user of the context created in Spark Job Server?
I want to change the user that I get from sparkSession.sparkContext.sparkUser().
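Background that may help here: Spark derives sparkUser from the SPARK_USER environment variable when it is set, falling back to the current OS/Hadoop user, so exporting SPARK_USER in the Job Server's environment before context creation is one lever. A minimal sketch of that resolution order (it mirrors Spark's internal Utils.getCurrentUserName, minus the Hadoop UserGroupInformation fallback):

```scala
object Main {
  def main(args: Array[String]): Unit = {
    // sparkUser resolution, sketched: SPARK_USER env var first,
    // then the JVM's notion of the current OS user.
    val sparkUser = sys.env.getOrElse("SPARK_USER", System.getProperty("user.name"))
    println(s"sparkUser would be: $sparkUser")
  }
}
```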
I have a requirement to show management/the client that the executor memory, number of cores, default parallelism, number of shuffle partitions, and other configuration properties for running the Spark job are not excessive or more than…
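One way to evidence the effective settings is simply to dump them from the running job, e.g. via `sparkSession.conf.getAll` (runtime SQL conf) or `sparkSession.sparkContext.getConf.getAll`. Sketched here with a plain map so it is self-contained; the keys are real Spark properties, but the values are placeholders, not recommendations:

```scala
object Main {
  def main(args: Array[String]): Unit = {
    // In a real job:
    //   sparkSession.conf.getAll.foreach(println)
    //   sparkSession.sparkContext.getConf.getAll.foreach(println)
    // Illustrative stand-in for that output:
    val settings = Map(
      "spark.executor.memory"        -> "4g",
      "spark.executor.cores"         -> "2",
      "spark.default.parallelism"    -> "8",
      "spark.sql.shuffle.partitions" -> "200"
    )
    settings.foreach { case (k, v) => println(s"$k = $v") }
  }
}
```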
I have been doing research about configuring the Spark Job Server backend (SharedDb) with Cassandra.
The SJS documentation cites Cassandra as one of the shared DBs that can be used.
Here is the documentation part:
Spark…
I have added job-server 0.9.0 dependencies in build.sbt by adding:
scalaVersion := "2.11.0"
resolvers += "Job Server Bintray" at "https://dl.bintray.com/spark-jobserver/maven"
libraryDependencies ++= Seq(
"spark.jobserver" %% "job-server-api" %…
I'm trying to adopt the new SJS 0.9.0 in my application. Once the context is created, I try to submit a job, and this happens:
19/04/10 21:45:06 ERROR JobDAOActor: About to restart actor due to…
When I build the context, I use the following parameters:
spark.cassandra.connection.host=somehosts&spark.cassandra.auth.username=app&spark.cassandra.auth.password=app
more detail as…
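As an alternative to passing the Cassandra settings on every context-creation call, Spark Job Server also lets you predefine contexts in its configuration file, where `spark.*` keys are passed through to the context's SparkConf. A sketch, assuming a context named `cass-context` (the name, host, and credentials are placeholders):

```hocon
spark {
  contexts {
    cass-context {
      # spark.* keys here are passed through to the context's SparkConf
      spark.cassandra.connection.host = "somehosts"
      spark.cassandra.auth.username   = "app"
      spark.cassandra.auth.password   = "app"
    }
  }
}
```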
When I submit a job to the Spark job-server, I can see that the Spark context is created. However, there is an error in the WebApi.getJobManagerForContext method:
[2018-06-11 07:05:24,495] INFO LocalContextSupervisorActor [] [] -
SparkContext…
I'm trying to call the Spark Job Server API from Node.js. The job, a Python egg file, provides the count of nulls in a file. When I call the API from Node, the request reaches the SJS server and the job starts, which triggers…
I am upgrading my server to Spark 2.3.0 and job-server 0.8.1-SNAPSHOT from Spark 2.1.1 and job-server 0.8.0 (which were working fine). I am using the JobSqlDao with MySQL and am using the SessionContextFactory to create a sqlContext. In local.conf,…
I am working with Spark and Cassandra, and in general things are straightforward and working as intended; in particular the spark-shell and running .scala processes to get results.
I'm now looking at using the Spark Job Server; I have the…
I have been looking for a way to get the percentage of the job completed for a given job ID.
Right now, the Spark JobServer UI shows the corresponding status for a running job:
{
"duration": "Job not done yet",
"classPath":…
I am trying to build an application with the Spark Job Server API (for Spark 2.2.0). But I found that there is no support for NamedObjects with SparkSession.
My code looks like:
import com.typesafe.config.Config
import org.apache.spark.sql.SparkSession
import…