
I'm running a Spark job on EMR and need to create a checkpoint. I tried using S3 but got this error message:

17/02/24 14:34:35 ERROR ApplicationMaster: User class threw exception:
java.lang.IllegalArgumentException: Wrong FS: s3://spark-jobs/checkpoint/31d57e4f-dbd8-4a50-ba60-0ab1d5b7b14d/connected-components-e3210fd6/2, expected: hdfs://ip-172-18-13-18.ec2.internal:8020

Here is my sample code:

...
val sparkConf = new SparkConf().setAppName("spark-job")
  .set("spark.default.parallelism", (CPU * 3).toString)
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[Member], classOf[GraphVertex], classOf[GraphEdge]))
  .set("spark.dynamicAllocation.enabled", "true")


implicit val sparkSession = SparkSession.builder().config(sparkConf).getOrCreate()

sparkSession.sparkContext.setCheckpointDir("s3://spark-jobs/checkpoint")
....

How can I checkpoint on AWS EMR?

Philip K. Adetiloye

3 Answers


There's a now-fixed bug in Spark which meant you could only checkpoint to the default FS, not any other one (like S3). It's fixed in master; I don't know about backports.

If it makes you feel any better: the way checkpointing works (write then rename()) is slow enough on an object store that you may be better off checkpointing locally and then uploading to S3 yourself.
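As a sketch of that workaround: checkpoint to the cluster's default FS (HDFS on EMR), where rename() is cheap, then copy the checkpoint directory to S3 with the Hadoop FileSystem API. The paths and bucket names below are placeholders, not values from the question.

```scala
import java.net.URI

import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("spark-job").getOrCreate()
val sc = spark.sparkContext

// Checkpoint to HDFS, where the final rename() is a fast metadata operation.
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")

val rdd = sc.parallelize(1 to 100)
rdd.checkpoint()
rdd.count() // the first action materializes the checkpoint

// At a convenient point (e.g. end of job), upload the directory to S3 yourself.
val conf = sc.hadoopConfiguration
val srcFs = FileSystem.get(new URI("hdfs:///"), conf)
val dstFs = FileSystem.get(new URI("s3://spark-jobs/"), conf)
FileUtil.copy(srcFs, new Path("hdfs:///tmp/spark-checkpoints"),
              dstFs, new Path("s3://spark-jobs/checkpoint-backup"),
              false /* deleteSource */, conf)
```

This trades checkpoint latency on the critical path for an explicit bulk copy you control.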

stevel

There is a fix in the master branch that allows checkpointing to S3 as well. I was able to build against it and it worked, so this should be part of the next release.
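With a Spark build that includes that fix, checkpointing directly to S3 is just a matter of pointing the checkpoint dir at the bucket; on EMR, EMRFS serves `s3://` URIs. A minimal sketch (bucket name taken from the question, everything else illustrative):

```scala
import org.apache.spark.sql.SparkSession

// Assumes a Spark build containing the fix that allows checkpointing
// to a non-default filesystem such as S3.
val spark = SparkSession.builder().appName("spark-job").getOrCreate()
val sc = spark.sparkContext

sc.setCheckpointDir("s3://spark-jobs/checkpoint")

val rdd = sc.parallelize(1 to 100)
rdd.checkpoint() // marks the RDD for checkpointing
rdd.count()      // the first action writes the checkpoint to S3
```

Outside EMR you would typically use an `s3a://` URI with the hadoop-aws connector instead.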

Philip K. Adetiloye
good to hear. Do add a reasonably long interval between checkpoints though, as it takes so long to save... rebuilding from infrequent checkpoints could be faster, and you will take less of a perf hit from the checkpointing itself. YMMV – stevel Feb 28 '17 at 13:10

Try something with AWS authentication like:

import org.apache.hadoop.conf.Configuration
import org.apache.spark.streaming.StreamingContext

val hadoopConf: Configuration = new Configuration()
// Register the s3n filesystem and its credentials under matching keys
hadoopConf.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3n.awsAccessKeyId", "id-1")
hadoopConf.set("fs.s3n.awsSecretAccessKey", "secret-key")

// getOrCreate lives on the StreamingContext companion object, not on SparkContext
val ssc = StreamingContext.getOrCreate(checkPointDir,
  () => createStreamingContext(checkPointDir, config), hadoopConf)
FaigB