Goal and Problem
I'm trying to compile a minimal version of Spark to get our container size down. We only use spark-sql and pyspark. Here's the Dockerfile I've been using:
FROM openjdk:20-bullseye
RUN apt-get update && \
    apt-get install -y git && \
    git clone --depth=1 --branch=v3.3.0 https://github.com/apache/spark /root/spark && \
    cd /root/spark && \
    ./dev/make-distribution.sh --tgz --pip \
        -pl :spark-core_2.12,:spark-sql_2.12 \
        -P '!test-java-home,kubernetes,hadoop-3,apache-release' \
        -DskipTests
When compiling, I get the following error:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.1.1:jar (attach-javadocs) on project spark-core_2.12: MavenReportException: Error while generating Javadoc:
[ERROR] Exit code: 1 - /root/spark/core/src/main/java/org/apache/spark/SparkFirehoseListener.java:36: error: cannot find symbol
[ERROR] public class SparkFirehoseListener implements SparkListenerInterface {
[ERROR] ^
[ERROR] symbol: class SparkListenerInterface
[ERROR] /root/spark/core/src/main/java/org/apache/spark/SparkFirehoseListener.java:38: error: cannot find symbol
[ERROR] public void onEvent(SparkListenerEvent event) { }
...
This is followed by many more "error: cannot find symbol" messages.
Questions
1. How do I fix this error so the build succeeds?
2. How do I turn off Javadoc generation from the command line? (I would really like to avoid editing files, since automating that is error-prone.)
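For context, here is the kind of thing I was hoping would work. This is an untested sketch: `maven.javadoc.skip` is a standard maven-javadoc-plugin property (not Spark-specific), and I'm assuming make-distribution.sh forwards extra arguments through to mvn:

```shell
# Untested: pass the standard maven-javadoc-plugin skip property
# through make-distribution.sh, hoping it reaches the underlying
# mvn invocation and disables the failing attach-javadocs goal.
./dev/make-distribution.sh --tgz --pip \
    -pl :spark-core_2.12,:spark-sql_2.12 \
    -P '!test-java-home,kubernetes,hadoop-3,apache-release' \
    -DskipTests -Dmaven.javadoc.skip=true
```

I don't know whether this is the right property for Spark's build, or whether one of the profiles I'm activating forces Javadoc generation regardless.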