I am trying to write an RDD to S3 with server-side encryption. Here is my code:
```scala
val sparkConf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("aws-encryption")
val sc = new SparkContext(sparkConf)

sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", AWS_ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", AWS_SECRET_KEY)

sc.hadoopConfiguration.setBoolean("fs.s3n.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3n.enableServerSideEncryption", "true")
sc.hadoopConfiguration.setBoolean("fs.s3n.enableServerSideEncryption", true)

sc.hadoopConfiguration.set("fs.s3n.sse", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3n.serverSideEncryptionAlgorithm", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3n.server-side-encryption-algorithm", "SSE-KMS")

sc.hadoopConfiguration.set("fs.s3n.sse.kms.keyId", KMS_ID)
sc.hadoopConfiguration.set("fs.s3n.serverSideEncryptionKey", KMS_ID)

val rdd = sc.parallelize(Seq("one", "two", "three", "four"))
rdd.saveAsTextFile(s"s3n://$bucket/$objKey")
```
This code writes the RDD to S3, but without encryption: when I check the properties of the written object, "Server-side encryption" shows "no". Am I missing anything here, or am I using any of these properties incorrectly?
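For reference, this is how I am checking the encryption status from the command line (bucket and key are placeholders; an SSE-KMS-encrypted object should report a `ServerSideEncryption` field of `aws:kms` in the response, while an unencrypted one omits it):

```shell
# Inspect the metadata of one of the written part files.
# The response includes "ServerSideEncryption" (and "SSEKMSKeyId")
# only when the object was stored with server-side encryption.
aws s3api head-object --bucket my-bucket --key path/to/part-00000
```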
Any suggestion would be appreciated.
P.S. I have set the same property under several different names because I am not sure which name is the right one, e.g.
```scala
sc.hadoopConfiguration.setBoolean("fs.s3n.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3n.enableServerSideEncryption", "true")
sc.hadoopConfiguration.setBoolean("fs.s3n.enableServerSideEncryption", true)
```
Thank you.