
I am using GeoSpark 1.3.1 and trying to find all geo points that are contained in a circle, given a center and a radius in meters. To do this I want to transform the center from degrees to meters, create the circle (using ST_Buffer), and then transform the returned polygon back to degrees, before applying an ST_Contains function in a join with all the geo points. Please see the SQL below:

WITH point_data AS (
 SELECT
   ST_Point(CAST(c.lon as Decimal(24,20)), CAST(c.lat as Decimal(24,20))) as geo_point
 FROM point_data_view as c
)
SELECT * FROM point_data as pd
WHERE ST_Contains(ST_Transform(ST_Buffer(ST_Transform(ST_Point(<LON>, <LAT>), 'epsg:4326', 'epsg:3857'), 1000.0), 'epsg:3857', 'epsg:4326'), pd.geo_point) = true
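One thing worth noting about this approach: the 1000.0 radius passed to ST_Buffer is measured in EPSG:3857 projected meters, which are stretched by roughly 1/cos(latitude) relative to ground meters. A minimal pure-Python sketch of the spherical Web Mercator round trip (no GeoSpark involved; the function names are mine) illustrates both the transform and the distortion:

```python
import math

R = 6378137.0  # sphere radius used by EPSG:3857 (Web Mercator), in meters

def to_mercator(lon, lat):
    """Forward transform: EPSG:4326 degrees -> EPSG:3857 meters."""
    x = R * math.radians(lon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

def to_lonlat(x, y):
    """Inverse transform: EPSG:3857 meters -> EPSG:4326 degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat

def ground_meters(projected_meters, lat):
    """Approximate true ground distance for a length measured in EPSG:3857
    at a given latitude (Mercator stretches lengths by ~1/cos(lat))."""
    return projected_meters * math.cos(math.radians(lat))
```

For example, at 55° north a 1000 m buffer in EPSG:3857 only covers roughly 574 m on the ground, so depending on the use case the projected radius may need to be scaled by 1/cos(lat) to get a true meter radius.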

However, when I follow the GeoSpark guide, adding the dependencies to my pom file and creating an uber jar that is submitted (with spark2-submit), I get the following error (only when using the ST_Transform function):

java.lang.NoSuchMethodError: org.hsqldb.DatabaseURL.parseURL(Ljava/lang/String;ZZ)Lorg/hsqldb/persist/HsqlProperties;
        at org.hsqldb.jdbc.JDBCDriver.getConnection(Unknown Source)
        at org.hsqldb.jdbc.JDBCDataSource.getConnection(Unknown Source)
        at org.hsqldb.jdbc.JDBCDataSource.getConnection(Unknown Source)
        at org.geotools.referencing.factory.epsg.DirectEpsgFactory.getConnection(DirectEpsgFactory.java:3302)
        at org.geotools.referencing.factory.epsg.ThreadedEpsgFactory.createBackingStore(ThreadedEpsgFactory.java:436)
        at org.geotools.referencing.factory.DeferredAuthorityFactory.getBackingStore(DeferredAuthorityFactory.java:133)
        at org.geotools.referencing.factory.BufferedAuthorityFactory.isAvailable(BufferedAuthorityFactory.java:235)
        at org.geotools.referencing.factory.DeferredAuthorityFactory.isAvailable(DeferredAuthorityFactory.java:119)
        at org.geotools.factory.FactoryRegistry.isAvailable(FactoryRegistry.java:667)
        at org.geotools.factory.FactoryRegistry.isAcceptable(FactoryRegistry.java:501)
        at org.geotools.factory.FactoryRegistry.getServiceImplementation(FactoryRegistry.java:437)
        at org.geotools.factory.FactoryRegistry.getServiceProvider(FactoryRegistry.java:365)
        at org.geotools.factory.FactoryCreator.getServiceProvider(FactoryCreator.java:145)
        at org.geotools.referencing.ReferencingFactoryFinder.getAuthorityFactory(ReferencingFactoryFinder.java:220)
        at org.geotools.referencing.ReferencingFactoryFinder.getCRSAuthorityFactory(ReferencingFactoryFinder.java:440)
        at org.geotools.referencing.factory.epsg.LongitudeFirstFactory.createBackingStore(LongitudeFirstFactory.java:192)
        at org.geotools.referencing.factory.DeferredAuthorityFactory.getBackingStore(DeferredAuthorityFactory.java:133)
        at org.geotools.referencing.factory.BufferedAuthorityFactory.isAvailable(BufferedAuthorityFactory.java:235)
        at org.geotools.referencing.factory.DeferredAuthorityFactory.isAvailable(DeferredAuthorityFactory.java:119)
        at org.geotools.factory.FactoryRegistry.isAvailable(FactoryRegistry.java:667)
        at org.geotools.factory.FactoryRegistry.isAcceptable(FactoryRegistry.java:501)
        at org.geotools.factory.FactoryRegistry$1.filter(FactoryRegistry.java:192)
        at javax.imageio.spi.FilterIterator.advance(ServiceRegistry.java:834)
        at javax.imageio.spi.FilterIterator.<init>(ServiceRegistry.java:828)
        at javax.imageio.spi.ServiceRegistry.getServiceProviders(ServiceRegistry.java:519)
        at org.geotools.factory.FactoryRegistry.getServiceProviders(FactoryRegistry.java:197)
        at org.geotools.referencing.ReferencingFactoryFinder.getFactories(ReferencingFactoryFinder.java:180)
        at org.geotools.referencing.ReferencingFactoryFinder.getCRSAuthorityFactories(ReferencingFactoryFinder.java:455)
        at org.geotools.referencing.DefaultAuthorityFactory.getBackingFactory(DefaultAuthorityFactory.java:89)
        at org.geotools.referencing.DefaultAuthorityFactory.<init>(DefaultAuthorityFactory.java:69)
        at org.geotools.referencing.CRS.getAuthorityFactory(CRS.java:263)
        at org.geotools.referencing.CRS.decode(CRS.java:525)
        at org.geotools.referencing.CRS.decode(CRS.java:453)
        at org.apache.spark.sql.geosparksql.expressions.ST_Transform.eval(Functions.scala:237)

I have tried to shade and relocate "org.hsqldb" in my build, but that doesn't change anything. Not including the GeoSpark jars in the uber jar and instead loading them as part of the spark2-submit gives the same error. I can't really find a way around this issue, and it only happens when I use the ST_Transform method.

It looks like my Spark platform ships an older version of org.hsqldb than the one GeoSpark requires!
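To check where the competing copy could be coming from, something like this jar scan is what I use (a sketch in plain Python, run outside Spark; the jar directory in the usage example is just a placeholder for your cluster's Spark jars path):

```python
import zipfile
from pathlib import Path

def jars_containing(jar_dir, class_prefix="org/hsqldb/"):
    """Return the names of jars under jar_dir that bundle class entries
    with the given package prefix, to spot competing hsqldb copies."""
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            if any(name.startswith(class_prefix) for name in zf.namelist()):
                hits.append(jar.name)
    return hits

# Example (path is hypothetical -- use your platform's Spark jars directory):
# print(jars_containing("/opt/.../spark2/jars"))
```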

My shading configuration looks like this:

<properties>
    <geoSparkVersion>1.3.1</geoSparkVersion>
    <sparkVersion>2.3.0</sparkVersion>
    <clouderaPackage>cloudera2</clouderaPackage>
    <scalaVersion>2.11.0</scalaVersion>
    <scalaBinaryVersion>2.11</scalaBinaryVersion>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scalaBinaryVersion}</artifactId>
        <version>${sparkVersion}.${clouderaPackage}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scalaBinaryVersion}</artifactId>
        <version>${sparkVersion}.${clouderaPackage}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.datasyslab</groupId>
        <artifactId>geospark</artifactId>
        <version>${geoSparkVersion}</version>
    </dependency>
    <dependency>
        <groupId>org.datasyslab</groupId>
        <artifactId>geospark-sql_2.3</artifactId>
        <version>${geoSparkVersion}</version>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.1</version>
            <configuration>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <finalName>${artifactId}-${version}-${jarNameWithDependencies}</finalName>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
                <relocations>
                    <relocation>
                        <pattern>org.hsqldb</pattern>
                        <shadedPattern>shaded.org.hsqldb</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
aweis

1 Answer


Looks like a classic dependency issue: your uber jar probably contains a different version of the org.hsqldb library. You should try excluding org.hsqldb.* from your dependencies, or shading it. I guess you use the maven-shade-plugin for the uber jar? If so, you can look here for how to exclude dependencies: https://maven.apache.org/plugins/maven-shade-plugin/examples/includes-excludes.html
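As a quick sanity check of whether the relocation actually took effect, you can count the class entries under the original and relocated packages inside the shaded jar (a sketch; point it at your actual uber jar, and the `shaded.` prefix assumes the relocation pattern from your pom):

```python
import zipfile

def relocation_report(uber_jar):
    """Count class entries under the original and relocated hsqldb
    packages in a shaded jar. Entries left under org/hsqldb/ mean
    the relocation did not cover everything."""
    original = relocated = 0
    with zipfile.ZipFile(uber_jar) as zf:
        for name in zf.namelist():
            if name.startswith("org/hsqldb/"):
                original += 1
            elif name.startswith("shaded/org/hsqldb/"):
                relocated += 1
    return {"org.hsqldb": original, "shaded.org.hsqldb": relocated}
```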

ShemTov
  • Thanks, that is also what I have tried to do. I updated the question with my shading of the package, but it does not change anything. – aweis Jul 17 '20 at 16:18
  • After packaging your jar, open it as a zip: do you have the directory org/hsqldb? – ShemTov Jul 19 '20 at 15:10
  • I have it under the defined shaded location, yes – aweis Jul 19 '20 at 16:42
  • Can you post your full pom.xml? – ShemTov Jul 19 '20 at 20:01
  • I have added the additional dependencies I use, but they are only Spark libraries, which are all provided when submitting my application with `spark2-submit` – aweis Jul 23 '20 at 13:10
  • Well, I think I know what's happening. You are shading the version of hsqldb imported from geospark, which means geospark will use the hsqldb version on your cluster, and that is probably not the version it needs. (That's why you get NoSuchMethodError.) – ShemTov Jul 24 '20 at 07:33
  • Yes, that has also been my conclusion, but then I tested with `println((new org.hsqldb.Server).getProductVersion)`, which prints out `2.3.0`, though I can't see if the shading has taken effect. I also tried to change the `gt-epsg-hsql` dependency to `gt-epsg-wkt` after other advice - but all with no luck so far! – aweis Jul 24 '20 at 09:47