
I'm using AWS Glue to run some PySpark code. Sometimes it succeeds, but sometimes it fails with a dependency error: `Resource Setup Error: Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: JohnSnowLabs#spark-nlp;2.5.4: not found]`. Here are the error logs:

:: problems summary ::
:::: WARNINGS
        module not found: JohnSnowLabs#spark-nlp;2.5.4

    ==== local-m2-cache: tried

      file:/root/.m2/repository/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      file:/root/.m2/repository/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

    ==== local-ivy-cache: tried

      /root/.ivy2/local/JohnSnowLabs/spark-nlp/2.5.4/ivys/ivy.xml

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      /root/.ivy2/local/JohnSnowLabs/spark-nlp/2.5.4/jars/spark-nlp.jar

    ==== central: tried

      https://repo1.maven.org/maven2/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      https://repo1.maven.org/maven2/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

    ==== spark-packages: tried

      https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

        ::::::::::::::::::::::::::::::::::::::::::::::

        ::          UNRESOLVED DEPENDENCIES         ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: JohnSnowLabs#spark-nlp;2.5.4: not found

        ::::::::::::::::::::::::::::::::::::::::::::::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: JohnSnowLabs#spark-nlp;2.5.4: not found]

From the logs of a successful run, I can see that Glue was able to download the dependency from https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom. The failed job tried the same URL but could not download it.

This issue seemed to resolve itself last week, but in the last couple of days it showed up again and hasn't gone away so far. Has anyone ever seen this weird issue? Thanks.

wawawa

1 Answer

spark-packages moved to a new host on May 1, 2021. In my Scala project I had to add a different resolver, like so; it should be similar in Java.

resolvers in ThisBuild ++= Seq(
  "SparkPackages" at "https://repos.spark-packages.org"
  // removed: "MVNRepository" at "https://dl.bintray.com/spark-packages/maven"
)

Go look for yourself: that package isn't on the resolver you were pointing at. Mine wasn't either.

https://dl.bintray.com/spark-packages/
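For the PySpark side, the same idea applies: point dependency resolution at the new repository, or skip spark-packages entirely and use Maven Central coordinates. A rough sketch using plain spark-submit flags (the exact Glue job parameter names may differ, and `com.johnsnowlabs.nlp:spark-nlp_2.11:2.5.4` is the Maven Central artifact I'd expect to match here — verify the Scala version against your runtime):

```shell
# Option 1: keep the spark-packages alias, but add the new repository host
# so Ivy stops falling back to the dead Bintray URL.
spark-submit \
  --repositories https://repos.spark-packages.org \
  --packages JohnSnowLabs:spark-nlp:2.5.4 \
  your_glue_script.py

# Option 2: resolve directly from Maven Central under the publisher's
# proper groupId, avoiding spark-packages altogether.
spark-submit \
  --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.5.4 \
  your_glue_script.py
```

Option 2 is the more durable fix, since it doesn't depend on a third-party repository staying online.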

Tony Fraser
  • The link doesn't allow directory browsing, so I can't help with that, but it's there, b/c I use it too. Think of a resolver as a URL for a place where all your packages live — all the maven stuff you include in your build but don't write yourself. You tell your build system (maven, sbt, whatever) where your packages are, and it magically downloads them. Third-party packages for Spark (non-Apache packages) were all moved off Bintray last week to this other place called spark-packages. It was a _huge_ deal for all us Spark guys last week. Not sure about Glue, sorry. – Tony Fraser May 10 '21 at 18:57
  • Hi, thanks for the answer. Just wondering, do you know where I can download this jar 'JohnSnowLabs:spark-nlp:2.5.4'? Couldn't find it anywhere ... – wawawa May 13 '21 at 12:04
  • The correct and latest place to find all Spark NLP related artifacts is on Maven here: https://mvnrepository.com/artifact/com.johnsnowlabs.nlp (`JohnSnowLabs#spark-nlp` seems like spark packages which was deprecated in Spark NLP - stopped publishing there and chose Maven as main repo) – Maziyar Jan 03 '23 at 14:59