
I am trying to execute spark-submit in a Java project which also needs some HAPI FHIR libraries. When I execute the job I get the following error:

    Error: Failed to load package.MainClass: org/hl7/fhir/instance/model/api/IAnyResource

I have already included the FHIR dependencies in my pom.xml.
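Roughly, the dependency section looks like this (a sketch from memory; the exact artifact list in the real pom.xml may differ):

    <!-- Illustrative HAPI FHIR R4 dependencies; the real pom.xml may list more artifacts -->
    <dependency>
        <groupId>ca.uhn.hapi.fhir</groupId>
        <artifactId>hapi-fhir-base</artifactId>
        <version>5.4.2</version>
    </dependency>
    <dependency>
        <groupId>ca.uhn.hapi.fhir</groupId>
        <artifactId>hapi-fhir-structures-r4</artifactId>
        <version>5.4.2</version>
    </dependency>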

FHIR release: R4
HAPI FHIR version: 5.4.2
PS: I am running it on an EC2 instance

Why So Serious

2 Answers


Importing a dependency in Maven provides it on the compile-time classpath so you can compile your code. It does not, by itself, package those dependencies with your code for deployment. I'm not sure what tooling is available for Spark or what its best practices are, but my guess is that you need to package an "uber" (aka "shaded") jar that contains your compiled code AND the dependencies, all within a single self-contained jar file that you can submit to Spark.

There are several tools that can do this. The most common is probably the maven shade plugin: https://maven.apache.org/plugins/maven-shade-plugin/
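To give a rough idea, a minimal configuration could look something like this (a sketch only; "package.MainClass" stands in for your real entry point):

    <!-- Sketch of a maven-shade-plugin setup; goes in the <build><plugins> section of the POM. -->
    <!-- "package.MainClass" is a placeholder for the project's actual entry point. -->
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.4</version>
        <executions>
            <execution>
                <!-- Bind the shade goal to the package phase so 'mvn package' builds the uber jar -->
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <transformers>
                        <!-- Writes Main-Class into the shaded jar's manifest so Spark can find the entry point -->
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <mainClass>package.MainClass</mainClass>
                        </transformer>
                    </transformers>
                </configuration>
            </execution>
        </executions>
    </plugin>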

crig
  • In case that wasn't 100% clear, you'd add the shade plugin config to the build section of your project's POM file and re-build it with 'mvn package'. The resulting jar file should be larger, indicating it worked and packaged the dependencies. – crig Jul 20 '21 at 22:50
  • Hi, thanks for your response, but the issue still persists after adding the shade plugin. – Why So Serious Jul 22 '21 at 05:08
  • I'm shooting in the dark here - you can examine the jar file to see what's in it (see the sketch after this thread). Could be the manifest does not specify your main class so Spark doesn't know the entry point? Maybe this will help? https://stackoverflow.com/questions/65169464/error-failed-to-load-class-main-using-spark-submit – crig Jul 22 '21 at 14:25
  • It looks to me like Spark doesn't know the entry point and for whatever reason is trying to load one of the HAPI classes as the main class. The class in question is just an interface, though, and definitely doesn't contain a main() method. This could be a problem with the jar manifest not specifying the entry point/main class, or with the spark-submit command. Or the jar is missing the dependencies, which was my first thought - but it's easy to check whether the jar has the dependencies included. For the record, I don't think your problem has anything to do specifically with FHIR or the HAPI libraries. – crig Jul 22 '21 at 14:36
  • I tried running it without any HAPI libraries and it works just fine without any error. I feel like it's a dependency issue but cannot figure out what exactly it is. – Why So Serious Jul 23 '21 at 10:17
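For reference, one way to do the checks suggested in the comments above (the jar name is illustrative):

    # List the shaded jar's contents to confirm the HAPI classes were actually bundled
    jar tf target/my-application.jar | grep IAnyResource
    # Print the manifest to check that Main-Class points at the right entry point
    unzip -p target/my-application.jar META-INF/MANIFEST.MF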

As @crig mentioned, Spark could not find the jars on its classpath. I tried creating a fat jar with all the dependencies, but I don't know why the fat jar could not be created. However, adding the jar to spark-submit with --jars hapi-fhir-base-5.4.2 solved the problem.
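For example, the working invocation looked roughly like this (a sketch; the class name and paths are placeholders):

    # Illustrative spark-submit call; class name and jar paths are placeholders
    spark-submit \
        --class package.MainClass \
        --jars /path/to/hapi-fhir-base-5.4.2.jar \
        /path/to/my-application.jar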

Why So Serious