
I have a project that uses both Spark and hadoop-aws (for resolving s3a URIs on Hadoop 2.6; I think a lot of projects use this configuration). However, they have a severe conflict in their transitive dependencies: Spark 1.3.1 uses jackson-databind 2.4.4, hadoop-aws for Hadoop 2.6 uses jackson-databind 2.2.3, and worst of all, neither will run on the other's version, since the Jackson API has changed considerably between those releases.
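For concreteness, the two declarations in my pom.xml look roughly like this (a minimal sketch; I'm assuming the Scala 2.10 build of spark-core and hadoop-aws 2.6.0, adjust to your actual coordinates):

```xml
<dependencies>
  <!-- pulls in jackson-databind 2.4.4 transitively -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.3.1</version>
  </dependency>
  <!-- pulls in jackson-databind 2.2.3 transitively -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>2.6.0</version>
  </dependency>
</dependencies>
```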

I know I can manually append the hadoop-aws jar only in the deployment phase and avoid using it during compilation/testing/packaging. But this seems like an 'inelegant' solution - best practice is to let Maven handle everything and to test all features before shipping. Is there a Maven configuration that allows me to do this?
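For reference, one way to express that deployment-only workaround is a Maven profile (the profile id `deploy` is made up for illustration) that adds hadoop-aws only when the final assembly is built, e.g. activated with `mvn package -Pdeploy`:

```xml
<profiles>
  <profile>
    <id>deploy</id>
    <dependencies>
      <!-- only present when packaging for deployment, so the Jackson
           conflict never surfaces during normal compile/test runs -->
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>2.6.0</version>
        <scope>runtime</scope>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```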

tribbloid
  • You can add exclusions to your dependencies in Maven (see the exclusion sketch below). Is this what you are looking for? See https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html – Kulu Limpa May 21 '15 at 02:46
  • So how is this going to work at runtime? You have 2 copies of class Foo.Bar.A ... how does the classloader pick which one to use? The first one found on disk will be used by both callers. You would need to go through some kind of class rename where one Foo.Bar.A gets renamed to Foo2.Bar.A (see the relocation sketch below) ... but then if anyone uses reflection, you would be hosed – bwawok May 21 '15 at 12:48
  • Thanks a lot bwawok. Unfortunately, combining Hadoop/Spark with s3a is a very common case and the recommended way to use S3 after Hadoop 2.7, so it has to be circumvented somehow. – tribbloid May 21 '15 at 22:19
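Following Kulu Limpa's suggestion, an exclusion would look roughly like this - a minimal sketch that strips the 2.2.3 copy coming in through hadoop-aws so only Spark's 2.4.4 stays on the classpath (whether hadoop-aws actually runs against 2.4.4 is exactly the open question here):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-aws</artifactId>
  <version>2.6.0</version>
  <exclusions>
    <!-- drop the older jackson-databind; Spark's 2.4.4 wins -->
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```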
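bwawok's class-rename idea is what the maven-shade-plugin calls relocation. A rough sketch of the mechanism, assuming one side (here, the Jackson used by hadoop-aws) is repackaged into a private namespace in a separately built shaded artifact so it no longer collides with Spark's copy; the package names are illustrative, and as noted above, reflection-based lookups would still break:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- rewrite Jackson classes (and references to them) into a
               private package so two versions can coexist at runtime -->
          <relocation>
            <pattern>com.fasterxml.jackson</pattern>
            <shadedPattern>shaded.com.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```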

0 Answers