
Running lein uberjar twice in a row, I get two different builds. After some unzip / find / sort / diff shell magic, I traced the difference to a single Maven file: pom.properties.

Here's a diff:

< #Tue Jan 14 07:07:50 CET 2014
---
> #Tue Jan 14 07:07:01 CET 2014

How can I get deterministic Clojure builds using Leiningen (and hence Maven)?

  • I don't get it. It looks like that file is supposed to record the time at which the build took place, so of course it will be different across runs. Why do you want to suppress this feature, which is presumably a standard part of maven? – amalloy Jan 14 '14 at 10:17
  • Adding on, I think the usual course of action is to forever archive the binary result. That is then your canonical source of truth. For example, if you compile the same source code on different JDKs, you might get different bytecode. If you wanted truly deterministic builds you would have to "check in" your entire environment (and physical hardware!). – Shepmaster Jan 14 '14 at 14:57
  • @amalloy: *"deterministic builds"* are ultra-important from a security point of view. Given all the recent "big brotherish" revelations about backdoors in software and hardware, more and more projects are adopting deterministic builds. Mozilla Firefox, for example, is now making progress towards deterministic builds. It's a very desirable feature which allows, among other things, compiling on different architectures / hardware (potentially compromised) and then comparing the results. A "feature" which prevents deterministic builds is a serious security weakness. – bitcoinNeverSleeps Jan 14 '14 at 17:37
  • I'd add that in a world where we use functional languages, append-only DBs, and DVCSes keeping the entire history, and where it's nearly always possible to recreate the state at any point in time, it just *"makes sense"* to have the option to create deterministic builds. Note that I never said I wanted to suppress any feature. If the time could optionally be passed as an option, we'd have the best of both worlds. (And maybe Maven already allows that!? But I don't know Maven well...) – bitcoinNeverSleeps Jan 14 '14 at 17:41
  • Strictly speaking, this behavior is neither a Leiningen nor a Maven feature. It arises because the `pom.properties` file is created with [`java.util.Properties`](http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html#store(java.io.Writer,%32java.lang.String)), which stores the current timestamp in a comment header at the top of the file. – Aaron Jan 18 '14 at 04:53
  • The above is supposed to link to the `store(java.io.Writer, java.lang.String)` method but Oracle's anchor and StackOverflow conspire against me. – Aaron Jan 18 '14 at 05:01
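
Since the timestamp written by `Properties.store` lives in a comment line, one way to compare or post-process two `pom.properties` files is to drop comment lines first. The following is a minimal sketch, assuming the file contains only simple `key=value` lines and comments (no backslash line continuations); the function name is hypothetical.

```python
def strip_properties_comments(text):
    """Drop comment lines (prefixed with '#' or '!') from .properties text.

    Assumes simple key=value content without backslash continuations.
    """
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith(("#", "!"))]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    build_a = "#Tue Jan 14 07:07:50 CET 2014\nversion=0.1.0\ngroupId=app\n"
    build_b = "#Tue Jan 14 07:07:01 CET 2014\nversion=0.1.0\ngroupId=app\n"
    # With the timestamp comment removed, both builds produce the same text.
    print(strip_properties_comments(build_a) == strip_properties_comments(build_b))
```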

2 Answers


I have a local patch to lein-voom (a project I maintain with Chouser) which will address this, fixing the pom.properties header time to VCS (currently only git) commit time if the working copy is entirely clean. I expect this commit to finalize sometime next week, though I'm still thinking about the configurability of this feature.

This alone doesn't make for stable jars but it is the first trivial piece. Also of interest are timestamps of files within the jar which will change the zip header. Normalizing timestamps should also be straightforward but is a separate step.

Deterministic builds are of interest to lein-voom, a project which may generally be of interest to you since it allows pointing dependencies directly to a particular source version by commit sha, avoiding artifacts altogether.

lein-voom is quite young and the documentation and CLI are pretty rough but the core functionality is solid. Feel free to post issues or questions on the GitHub project.

Aaron

I wrote up an article a while back covering deterministic builds with Maven. I have extracted the salient points here:

Use the assembly plugin and configure it like this:

src/main/assembly/zip.xml:

<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
  <id>deterministic</id>
  <baseDirectory>/</baseDirectory>
  <formats>
    <format>zip</format>
  </formats>
  <fileSets>
    <fileSet>
      <directory>${project.build.directory}/classes</directory>
      <outputDirectory>/</outputDirectory>
    </fileSet>
  </fileSets>
</assembly>

Add your own MANIFEST.MF, remembering the extra CRLF at the end, or it won't be valid.

src/main/resources/META-INF/MANIFEST.MF:

Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Created-By: Apache Maven
Built-By: yourapp
Build-Jdk: 1.7.0

Add some plugins into your pom.xml:

pom.xml:

<plugins>

... other plugins ...

    <!-- Step 1: Set all timestamps to same value -->
    <plugin>
      <artifactId>maven-antrun-plugin</artifactId>
      <version>1.7</version>
      <executions>
        <execution>
          <id>1-touch-classes</id>
          <phase>prepare-package</phase>
          <configuration>
            <target>
              <touch datetime="01/01/2000 00:10:00 am">
                <fileset dir="target/classes"/>
              </touch>
            </target>
          </configuration>
          <goals>
            <goal>run</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

    <!-- Step 2: Assemble as a ZIP to avoid MANIFEST.MF timestamp -->
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>2.2.1</version>
      <configuration>
        <descriptors>
          <descriptor>src/main/assembly/zip.xml</descriptor>
        </descriptors>
      </configuration>
      <executions>
        <execution>
          <id>2-make-assembly</id>
          <phase>prepare-package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

    <!-- Step 3: Rename ZIP as JAR -->
    <plugin>
      <artifactId>maven-antrun-plugin</artifactId>
      <version>1.7</version>
      <executions>
        <execution>
          <id>3-rename-assembly</id>
          <phase>package</phase>
          <configuration>
            <target>
              <move file="${project.build.directory}/${project.build.finalName}-deterministic.zip"
                    tofile="${project.build.directory}/${project.build.finalName}-deterministic.jar"/>
            </target>
          </configuration>
          <goals>
            <goal>run</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

... more plugins ...

</plugins>

This will create a deterministic JAR, but it will still depend on the exact version of the JVM and operating system you build it with. To overcome that, explore the Gitian approach used by the Bitcoin Core project and mandate a particular JVM within the VirtualBox environment. That way, multiple developers can build from the source independently and then sign the binary to state that they are in agreement. When a certain threshold is reached, the code is considered proven to be deterministic and can be released.
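
The threshold idea can be sketched in Python as follows. This is a simplification and not how Gitian itself works (Gitian uses signed build records): each builder reports the SHA-256 of their artifact, and a digest is accepted once enough builders report the same value. The function names, builder names, and threshold are all hypothetical.

```python
import hashlib
from collections import Counter


def artifact_digest(data):
    """SHA-256 hex digest of a built artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def agreed_digest(reported, threshold):
    """Return the digest reported by at least `threshold` builders, else None.

    `reported` maps a builder name to the digest that builder obtained.
    """
    if not reported:
        return None
    digest, count = Counter(reported.values()).most_common(1)[0]
    return digest if count >= threshold else None


if __name__ == "__main__":
    good = artifact_digest(b"jar bytes from a clean build")
    odd = artifact_digest(b"jar bytes from a different environment")
    reports = {"alice": good, "bob": good, "carol": odd}
    print(agreed_digest(reports, threshold=2) == good)
```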

Gary