The environment:
- Hadoop: 2.7.3 (HDP 2.5.3.0-37)
- Spark: 1.6.2
- Scala: 2.10.5
- Java: 1.8
Quick summary: the fat jar spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar bundles the class files from the BouncyCastle jar and strips BouncyCastle's signature in the process; as a result, BouncyCastleProvider cannot be used as a JCE provider, because JCE refuses to load a provider unless the jar containing it is verified against a valid signature.
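The bundling is easy to confirm (a diagnostic sketch; the jar path is the one from the stack trace below):

jar tf /hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar | grep 'org/bouncycastle'
jar tf /hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar | grep 'META-INF/.*\.SF'

The first command lists the BouncyCastle classes repackaged into the assembly; the second comes back empty because the signature files from the original bcprov jar were stripped during assembly.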
Stack trace:
java.security.NoSuchProviderException: JCE cannot authenticate the provider BC
at javax.crypto.JceSecurity.getInstance(JceSecurity.java:100)
at javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:204)
at ai.by247.MainJob$.scrapeLogs(MainJob.scala:57)
at ai.by247.MainJob$.main(MainJob.scala:26)
at ai.by247.MainJob.main(MainJob.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.jar.JarException: file:/hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar has unsigned entries - org/apache/spark/SparkConf$$anonfun$5.class
at javax.crypto.JarVerifier.verifySingleJar(JarVerifier.java:500)
at javax.crypto.JarVerifier.verifyJars(JarVerifier.java:361)
at javax.crypto.JarVerifier.verify(JarVerifier.java:289)
at javax.crypto.JceSecurity.verifyProviderJar(JceSecurity.java:159)
at javax.crypto.JceSecurity.getVerificationResult(JceSecurity.java:185)
at javax.crypto.JceSecurity.getInstance(JceSecurity.java:97)
... 13 more
The Scala logic that triggers this error is simple:
import java.security.Security
import javax.crypto.SecretKeyFactory
import org.bouncycastle.jce.provider.BouncyCastleProvider

Security.insertProviderAt(new BouncyCastleProvider(), 1)
SecretKeyFactory.getInstance("PBEWITHSHA256AND256BITAES-CBC-BC", "BC")
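JCE verifies the jar at the provider class's code source, not the provider's registration order, so a quick way to see why verification fails is to check where BouncyCastleProvider is actually loaded from (a diagnostic sketch; in this setup it prints the assembly jar's path rather than a signed bcprov jar):

// Prints the jar that BouncyCastleProvider was loaded from.
println(classOf[BouncyCastleProvider].getProtectionDomain.getCodeSource.getLocation)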
After reading several articles, I've attempted to work around the issue with various configuration options, so far to no avail. For example:
--conf spark.driver.extraJavaOptions="-Djava.security.properties=file:/path/to/my.java.security -Djava.security.policy=file:/path/to/my.security.policy"
where my.java.security is:
security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider
and my.security.policy is:
grant {
    // There is no restriction on any algorithms.
    permission javax.crypto.CryptoAllPermission;
};
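Two notes on this attempt. First, a single '=' in -Djava.security.properties=... appends the file to the JRE's master java.security settings, while '==' replaces them wholesale; the snippet above uses the append form. Second, the executors would presumably need the same options for this to matter cluster-wide, along the lines of (same hypothetical paths):

--conf spark.executor.extraJavaOptions="-Djava.security.properties=file:/path/to/my.java.security -Djava.security.policy=file:/path/to/my.security.policy"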
Possibly I could resolve this issue by coercing spark-submit to put the signed BouncyCastle jar ahead of the assembly on the classpath for both the driver and the executors. It's not clear from the docs I've read whether that's possible.
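If it is possible, I'd expect the invocation to look something like this (a sketch only: the bcprov path and version are placeholders, and the userClassPathFirst settings are marked experimental in the Spark 1.6 docs):

spark-submit \
  --jars /path/to/bcprov-jdk15on-1.54.jar \
  --conf spark.driver.extraClassPath=/path/to/bcprov-jdk15on-1.54.jar \
  --conf spark.executor.extraClassPath=/path/to/bcprov-jdk15on-1.54.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  ...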
More generally: I suspect that folding BouncyCastle, or the contents of any other signed jar, into the spark-assembly-hadoop jar is simply an error. That is, if building a fat jar strips the signatures of the signed jars it includes, then I question whether there's any utility in those classes being in the fat jar at all. (I realize this version of Spark, 1.6.2, is a legacy release, so maybe this issue has already been addressed in more recent releases.)
How can I get around this issue?