
Goal and Problem

I'm trying to compile a minimal version of Spark to get our container size down. We only use spark-sql and PySpark. Here's the Dockerfile I've been using:

FROM openjdk:20-bullseye

RUN apt-get update && \
    apt-get install git -y && \
    git clone --depth=1 --branch=v3.3.0 https://github.com/apache/spark /root/spark && \
    cd /root/spark && \
    ./dev/make-distribution.sh --tgz --pip -pl :spark-core_2.12,:spark-sql_2.12 -P '!test-java-home,kubernetes,hadoop-3,apache-release' -DskipTests

When compiling, I get the following error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.1.1:jar (attach-javadocs) on project spark-core_2.12: MavenReportException: Error while generating Javadoc:
[ERROR] Exit code: 1 - /root/spark/core/src/main/java/org/apache/spark/SparkFirehoseListener.java:36: error: cannot find symbol
[ERROR] public class SparkFirehoseListener implements SparkListenerInterface {
[ERROR]                                               ^
[ERROR]   symbol: class SparkListenerInterface
[ERROR] /root/spark/core/src/main/java/org/apache/spark/SparkFirehoseListener.java:38: error: cannot find symbol
[ERROR]   public void onEvent(SparkListenerEvent event) { }
...

After that there's just a long run of "error: cannot find symbol" messages.

Question

How do I fix this so the build works?
How do I turn off documentation generation from the command line? (I would really like to avoid changing files, as automating that is error-prone.)
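For context, I know that plain Maven builds can skip the Javadoc plugin with -Dmaven.javadoc.skip=true, and make-distribution.sh appears to forward extra arguments to Maven, so I've been considering something along these lines (untested against this exact Spark version, and I'm not sure the apache-release profile respects it):

```shell
# Same build as in the Dockerfile above, but trying to skip Javadoc
# generation by forwarding -Dmaven.javadoc.skip=true to Maven.
./dev/make-distribution.sh --tgz --pip \
    -pl :spark-core_2.12,:spark-sql_2.12 \
    -P '!test-java-home,kubernetes,hadoop-3,apache-release' \
    -DskipTests -Dmaven.javadoc.skip=true
```

Is that the right flag here, or does the release profile bind the javadoc jar in a way that ignores it?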

