The environment:

  • Hadoop: 2.5.3.0-37
  • Spark: 1.6.2
  • Scala: 2.10.5
  • Java: 1.8

Quick summary: the fat jar spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar bundles the class files from the BouncyCastle jar but not its signature; as a result, the BouncyCastleProvider cannot be used as a JCE provider, because JCE authenticates a provider by verifying the signature of the jar the provider class was loaded from.
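
Both claims can be checked directly against the assembly (path taken from the stack trace below):

    # List the BouncyCastle class files bundled into the assembly:
    unzip -l /hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar \
        | grep 'org/bouncycastle'

    # Check the signature status; jarsigner reports this jar as unsigned:
    jarsigner -verify /hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar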

Stack trace:

java.security.NoSuchProviderException: JCE cannot authenticate the provider BC
        at javax.crypto.JceSecurity.getInstance(JceSecurity.java:100)
        at javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:204)
        at ai.by247.MainJob$.scrapeLogs(MainJob.scala:57)
        at ai.by247.MainJob$.main(MainJob.scala:26)
        at ai.by247.MainJob.main(MainJob.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.jar.JarException: file:/hdp/2.5.3.0-37/spark/lib/spark-assembly-1.6.2.2.5.3.0-37-hadoop2.7.3.2.5.3.0-37.jar has unsigned entries - org/apache/spark/SparkConf$$anonfun$5.class
        at javax.crypto.JarVerifier.verifySingleJar(JarVerifier.java:500)
        at javax.crypto.JarVerifier.verifyJars(JarVerifier.java:361)
        at javax.crypto.JarVerifier.verify(JarVerifier.java:289)
        at javax.crypto.JceSecurity.verifyProviderJar(JceSecurity.java:159)
        at javax.crypto.JceSecurity.getVerificationResult(JceSecurity.java:185)
        at javax.crypto.JceSecurity.getInstance(JceSecurity.java:97)
        ... 13 more

The Scala logic that triggers this error is simple:

    import java.security.Security
    import javax.crypto.SecretKeyFactory
    import org.bouncycastle.jce.provider.BouncyCastleProvider

    Security.insertProviderAt(new BouncyCastleProvider(), 1)
    SecretKeyFactory.getInstance("PBEWITHSHA256AND256BITAES-CBC-BC", "BC")

After reading several articles, I've attempted to work around the issue with various configuration options, so far to no avail. For example:

--conf spark.driver.extraJavaOptions="-Djava.security.properties=file:/path/to/my.java.security -Djava.security.policy=file:/path/to/my.security.policy"

where my.java.security is:

security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider

and my.security.policy is:

grant {
    // There is no restriction to any algorithms.
    permission javax.crypto.CryptoAllPermission; 
};
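
To see which jar the provider class is actually loaded from at runtime (i.e. the jar JCE will try to authenticate), here is a quick check I can run from spark-shell:

    import org.bouncycastle.jce.provider.BouncyCastleProvider
    // Print the code source of the loaded class; if this points at the
    // spark-assembly jar rather than a signed bcprov jar, JCE authentication
    // is bound to fail.
    println(classOf[BouncyCastleProvider].getProtectionDomain.getCodeSource.getLocation)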

Possibly I could resolve this issue by coercing spark-submit to prioritize the signed version of the BouncyCastle jar-file in the classpath, for both the driver and the executors. It's not clear from the docs I've read whether that's possible.
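
The sort of invocation I have in mind is sketched below. The bcprov path and version are placeholders, and the executor entry assumes the jar exists at that path on every worker node; spark.{driver,executor}.extraClassPath entries are prepended to the respective classpaths, and the userClassPathFirst flags are documented as experimental in the 1.6 docs:

    spark-submit \
        --conf spark.driver.extraClassPath=/path/to/bcprov-jdk15on-1.54.jar \
        --conf spark.executor.extraClassPath=/path/to/bcprov-jdk15on-1.54.jar \
        --conf spark.driver.userClassPathFirst=true \
        --conf spark.executor.userClassPathFirst=true \
        --class ai.by247.MainJob my-job.jar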

More generally: I'm suspicious that including BouncyCastle, or the contents of any other signed jar, in the spark-assembly jar is simply an error. That is, if building a fat jar strips the signatures of the signed jars it includes, then there's a question in my mind whether there's any utility in those classes being in the fat jar at all. (I realize this version of Spark, 1.6.2, is a legacy release, so perhaps this issue has already been addressed in more recent releases.)
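
If the classpath route turns out to be a dead end, one fallback I'm considering is BouncyCastle's lightweight API, which does not go through JCE at all and therefore never triggers provider authentication. A rough sketch, assuming the JCE name PBEWITHSHA256AND256BITAES-CBC-BC corresponds to PKCS#12 key derivation with SHA-256 (an assumption I have not yet verified), with placeholder inputs:

    import org.bouncycastle.crypto.PBEParametersGenerator
    import org.bouncycastle.crypto.digests.SHA256Digest
    import org.bouncycastle.crypto.generators.PKCS12ParametersGenerator
    import org.bouncycastle.crypto.params.{KeyParameter, ParametersWithIV}

    // Hypothetical inputs, for illustration only.
    val password = "secret".toCharArray
    val salt = new Array[Byte](16) // zero-filled placeholder salt
    val iterations = 1024

    // PKCS#12 PBE key derivation via the lightweight API: no JCE involved,
    // hence no provider-jar signature verification.
    val gen = new PKCS12ParametersGenerator(new SHA256Digest())
    gen.init(PBEParametersGenerator.PKCS12PasswordToBytes(password), salt, iterations)

    // Derive a 256-bit AES key and a 128-bit IV.
    val params = gen.generateDerivedParameters(256, 128).asInstanceOf[ParametersWithIV]
    val keyBytes = params.getParameters.asInstanceOf[KeyParameter].getKey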

How can I get around this issue?
