3

A planned software upgrade causes stricter SQL parsing of Flyway migration scripts. The syntax needs to fixed, but this will change the checksum and fail Flyway's validation. The semantics of the SQL do not change. Is there of making the scripts legal without clumsily repairing databases?

It looks like a 32-bit checksum, so that is unlikely to be secure. Ideally I'd like:

  • just a few magic printable US ASCII letters in a comment at the top of the file
  • not require me to give my SQL away
  • generated by code that I can understand
  • not need any special hardware or configuration

Does anyone have any cunning techniques?

Tom Hawtin - tackline
  • 145,806
  • 30
  • 211
  • 305
  • Could you baseline your target, so it effectively ignores migrations that have already been applied? – David Atkinson Oct 31 '22 at 10:33
  • @DavidAtkinson I understand baselining in Flyway involves forking out for the Team Edition, rather than sticking with the Community Edition. Perhaps that might be cost effective. – Tom Hawtin - tackline Oct 31 '22 at 13:03
  • The "flyway baseline" command is available in Flyway Community. It instructs the target to ignore migrations below a specific version number, so it might mean (as I've not tried it) that incorrect checksums in migration scripts below the baseline number will not cause Flyway to fail validation. – David Atkinson Nov 01 '22 at 10:00
  • "It looks like a 32-bit checksum". So what is it? Is it a CRC? The code in your answer uses a standard CRC-32. – Mark Adler Nov 02 '22 at 14:41
  • @MarkAdler Yes the 32-bit checksum is a CRC-32 (after some light processing of the source), as described in accepted answer of the first link of my answer. – Tom Hawtin - tackline Nov 02 '22 at 14:51
  • Then you should update your question with what it actually is. – Mark Adler Nov 02 '22 at 15:07
  • @MarkAdler It is a checksum. I think precisely what it is belongs in the answer - it is something I had to go search for. – Tom Hawtin - tackline Nov 02 '22 at 15:11
  • Odd philosophy. You want anyone who might try to answer the question to have to go searching for what you've already found. Knowing it's a CRC makes the solution much more tenable, so someone reading the question would be more likely to tackle it if that information is up front. By the way, no, a CRC is not a checksum. – Mark Adler Nov 02 '22 at 15:32
  • @MarkAdler `java.util.zip` thinks CRC-32 is-a checksum. From my answer: `Checksum checksum = new CRC32()`. Wikipedia(!) on Checksum: "A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data integrity but are not relied upon to verify data authenticity." – Tom Hawtin - tackline Nov 02 '22 at 15:56
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/249252/discussion-between-mark-adler-and-tom-hawtin-tackline). – Mark Adler Nov 02 '22 at 16:03

2 Answers2

2

You can use spoof to tell you how to change the bits in positions you specify in the message to get the desired CRC.

It is monumentally faster than just randomly poking at the message until you get the CRC you want.

Here is an example of spoofing a message, which can be adapted to your application:

// Example of spoofing a CRC. This takes a message with the pattern AAAAAAAAA
// and a desired CRC, and modifies the pattern to upper and lower case letters
// such that it has the desired CRC. The standard CRC-32/ISO-HDLC is used.

import java.io.*;
import java.nio.file.*;

class spoofcrc {
    // Return the standard CRC-32 of data[off..off+len-1] with initial CRC crc.
    private static int crc32(int crc, byte[] data, int off, int len) {
        crc = ~crc;
        for (int i = off; i < off + len; i++) {
            crc ^= (int)data[i] & 0xff;
            for (int k = 0; k < 8; k++)
                crc = (crc & 1) != 0 ? (crc >>> 1) ^ (int)0xedb88320 :
                                       crc >>> 1;
        }
        return ~crc;
    }

    // Return the reverse CRC-32 computed on data[off..off+len-1], with final
    // CRC crc. When given the result of crc32() as crc, crc32rev() will return
    // the initial crc value given to crc32() on the same data.
    private static int crc32rev(int crc, byte[] data, int off, int len) {
        crc = ~crc;
        for (int i = off + len - 1; i >= off; i--) {
            for (int k = 0; k < 8; k++)
                crc = (crc & 0x80000000) != 0 ? (crc << 1) ^ (int)0xdb710641 :
                                                crc << 1;
            crc ^= (int)data[i] & 0xff;
        }
        return ~crc;
    }

    // Pre-computed matrix inverse of CRCs of select modified bit positions.
    private static final byte[][] inv =
        {{0x20,0x02,0x2e,0x04,0x02,0x0c,0x2e,0x08,0x08},
         {0x0e,0x2c,0x24,0x26,0x0a,0x22,0x20,0x08,0x24},
         {0x08,0x22,0x28,0x2c,0x02,0x28,0x24,0x0c,0x00},
         {0x0e,0x2c,0x00,0x0a,0x02,0x0c,0x0a,0x08,0x2c},
         {0x2e,0x28,0x00,0x2e,0x02,0x2e,0x24,0x04,0x24},
         {0x26,0x2a,0x2a,0x02,0x06,0x0a,0x22,0x04,0x00},
         {0x0e,0x2a,0x28,0x06,0x0e,0x08,0x0a,0x08,0x28},
         {0x0e,0x0a,0x2c,0x04,0x0a,0x0e,0x06,0x04,0x00},
         {0x24,0x02,0x28,0x22,0x06,0x28,0x22,0x08,0x2c},
         {0x20,0x24,0x0c,0x22,0x08,0x0e,0x08,0x0c,0x00},
         {0x0e,0x2a,0x00,0x00,0x06,0x24,0x04,0x00,0x04},
         {0x00,0x0a,0x20,0x06,0x06,0x22,0x02,0x00,0x08},
         {0x20,0x04,0x08,0x06,0x00,0x06,0x04,0x08,0x0c},
         {0x2a,0x28,0x00,0x2c,0x02,0x20,0x2e,0x0c,0x0c},
         {0x08,0x28,0x0e,0x02,0x08,0x06,0x22,0x0c,0x08},
         {0x20,0x2e,0x04,0x0e,0x0e,0x04,0x0c,0x0c,0x28},
         {0x0e,0x02,0x2c,0x06,0x04,0x00,0x0a,0x0c,0x00},
         {0x04,0x2c,0x22,0x00,0x02,0x06,0x04,0x08,0x20},
         {0x08,0x2c,0x02,0x2c,0x02,0x2e,0x00,0x08,0x00},
         {0x2e,0x0c,0x04,0x08,0x0e,0x04,0x2a,0x00,0x00},
         {0x06,0x2e,0x2c,0x2e,0x06,0x24,0x28,0x00,0x24},
         {0x28,0x26,0x2c,0x2c,0x0c,0x0e,0x20,0x04,0x04},
         {0x24,0x2a,0x02,0x02,0x00,0x22,0x28,0x04,0x28},
         {0x28,0x26,0x04,0x0c,0x0a,0x08,0x2c,0x00,0x20},
         {0x20,0x2e,0x0c,0x0e,0x0c,0x0a,0x02,0x00,0x20},
         {0x26,0x2e,0x06,0x20,0x06,0x08,0x24,0x00,0x08},
         {0x04,0x04,0x2a,0x0e,0x0c,0x0c,0x24,0x00,0x24},
         {0x2a,0x20,0x26,0x02,0x08,0x28,0x20,0x04,0x08},
         {0x20,0x2c,0x0e,0x2a,0x00,0x02,0x0e,0x04,0x08},
         {0x06,0x28,0x2c,0x28,0x04,0x02,0x06,0x08,0x0c},
         {0x22,0x0c,0x24,0x2e,0x0c,0x2e,0x2c,0x00,0x0c},
         {0x2c,0x2c,0x28,0x22,0x00,0x28,0x0c,0x0c,0x24}};

    // Read the message from stdin, expected to have the pattern AAAAAAAAA in
    // it. Modify that pattern to other upper or lower case letters to get the
    // desired CRC provided as the first argument. The argument can be provided
    // in decimal, or in hexadecimal by preceding the digits with "0x". Write
    // the result to stdout.
    public static void main(String[] args) throws IOException {
        // Get the desired CRC.
        if (args.length != 1) {
            System.err.println("Need one argument: the desired CRC.");
            System.exit(1);
        }
        int want = (int)(Long.decode(args[0]) & 0xffffffff);

        // Get the message from stdin and find the pattern to modify.
        byte[] msg = System.in.readAllBytes();
        final int plen = 9;
        int pos = 0, hits = 0;
        while (pos < msg.length)
            if (msg[pos++] != (byte)'A')
                hits = 0;
            else if (++hits == plen)
                break;
        if (hits < plen) {
            System.err.println("Could not find AAAAAAAAA in input.");
            System.exit(1);
        }

        // Modify the pattern to get the desired CRC.
        int xor = crc32(0, msg, 0, pos) ^
                  crc32rev(want, msg, pos, msg.length - pos);
        for (int i = 0; i < inv.length; i++) {
            if ((xor & 1) != 0)
                for (int j = 0; j < plen; j++)
                    msg[pos - plen + j] ^= inv[i][j];
            xor >>>= 1;
        }

        // Write the modified message to stdout.
        System.out.write(msg);
    }
}
Mark Adler
  • 101,978
  • 13
  • 118
  • 158
1

Great question. Unless the algorithm is designed to be particularly difficult, such as bcrypt, naively zipping through billions of possibilities for the 1 in 2^32 (~4 billion) chance ought to be doable. In fact Flyway munges the script and then applies the well known CRC32 error-detection code (whole process described here).

Whilst an inverse CRC32 function exists, it is much easier to brute force it. The technique also works for cryptographic hashes. Some CPUs have hardware CRC32 acceleration to make this even quicker. Longer files will take longer. If Java had a more extensive API, putting the bodged letters at the end could be used to speed it up.

The code below attempts to find a seven capital letter solution - 26^7 (~8 billion) guesses. Pass the desired checksum as an argument to the program and pipe the source SQL migration script through standard input. For convenience the program will print its calculation of the Flyway checksum for the original file and then, after some time, the first solution it finds without new lines. There may not be any solutions (there isn't one for the exact program itself), in which case try again with a minor change to the file.

java ReverseFlyway.java 16580903 < V42__add_bark.sql

Put the string XXXXXXX in the place where you want the text to be modified.

It is important that the semantics of the SQL do not change. It's unfortunately very easy to chang semantics of the script whilst retaining its checksum. For instance,

-- Robert-DROP TABLE Students;

has the same Flyway checksum as

-- Robert-
DROP TABLE Students;

(Moral: Normalise, don't delete sections.)

Exact details of how Flyway is implemented may change between versions. If you have weird stuff, such as BOMs, something might need to be modified.

If you prefer, the code is easily changed to search for two or three words, a number of spaces and tabs, a limerick, or whatever takes your fancy.

import java.io.*;
import java.util.zip.*;

class ReverseFlyway {
    private final Checksum checksum = new CRC32();
    private final int target;
    private final byte[] data;

    public static void main(String[] args) throws IOException {
        /** /
        new ReverseFlyway("Magic 'XXXXXXX'", Integer.MIN_VALUE);
        /*/
        String text = loadText();
        new ReverseFlyway(text, Integer.parseInt(args[0]));
        /**/
    }
    private ReverseFlyway(String text, int target) {
        this.target = target;
        this.data = text.getBytes();
        System.err.println(checksum());
        int magicLen = 7;
        int place = text.indexOf("X".repeat(magicLen));
        attempt(place, magicLen);
        System.err.println("No solutions found");
        System.exit(1);
    }
    private int checksum() {
        checksum.reset();
        checksum.update(data);
        return (/** /short/*/int/**/) checksum.getValue();
    }
    private void attempt(int place, int remaining) {
        if (remaining == 0) {
            if (target == checksum()) {
                System.out.println(new String(data));
                System.exit(0);
            }
        } else {
            for (byte letter = 'A'; letter <= 'Z'; ++letter) {
                data[place] = letter;
                attempt(place+1, remaining-1);
            }
        }
    }
    private static String loadText() throws IOException {
        StringBuilder buff = new StringBuilder();
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        for (;;) {
            String line = in.readLine();
            if (line == null) {
                return buff.toString();
            }
            buff.append(line);
        }
    }
}
Tom Hawtin - tackline
  • 145,806
  • 30
  • 211
  • 305