77

What's the best way to go about making a patch for a binary file?

I want it to be simple for users to apply (a simple patch application would be nice). Running diff on the file just gives Binary files [...] differ.

Peter Mortensen
Mike

9 Answers

68

Check out bsdiff and bspatch (website, manpage, paper, GitHub fork).

To install this tool:

  • Windows: Download and extract this package. You will also need a copy of bzip2.exe in PATH; download that from the "Binaries" link here.
  • macOS: Install Homebrew and use it to install bsdiff.
  • Linux: Use your package manager to install bsdiff.
MultiplyByZer0
Heinzi
  • 2
    Quite old source. It is not easy to compile with modern Visual Studio: with VS 2009 it worked, but I got errors with newer versions. Furthermore, it is 32-bit only, which is a real issue for memory consumption (see other answers). I am not sure whether just compiling for x64 fixes this; I switched to a .NET port (see the other answer). – Philm Jul 07 '15 at 17:23
  • 1
    `bsdiff` and `courgette` are optimized for executable binaries; found some [unofficial Windows binaries](http://tpokorra.blogspot.cz/2007/11/bsdiff-for-windows-and-net.html), but it failed right away – Vlastimil Ovčáčík Sep 05 '17 at 07:44
  • On Windows, already using Cygwin and `apt-cyg`, the packages exist there too for easy installation and execution! – Pysis Feb 07 '23 at 17:34
  • Note that the patch files are not humanly readable, which is what I need it for. – user18619318 Apr 03 '23 at 12:41
  • @user18619318: Since binary files are *by definition* not humanly readable, what would you expect such a patch file to look like? – Heinzi Apr 03 '23 at 13:20
  • @heinzi Same as for text diffs. Something that tells me that for instance in position X bytes A, B, C are present in file 1 but not file 2, and in position Y, four bytes are present in file 2 but not file 1, along with their values. – Thorbjørn Andersen - UFST Apr 04 '23 at 14:05
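
The kind of readable report asked for in the last comment can be sketched in a few lines of Python. This is only an illustration built on the standard library's difflib, not something bsdiff or bspatch produce; the function name `describe_changes` and the report format are made up for this example, and difflib gets slow on large inputs, so treat it as a tool for small files.

```python
from difflib import SequenceMatcher


def describe_changes(old: bytes, new: bytes) -> list:
    """Return one readable line per changed region between two byte strings."""
    # autojunk=False stops difflib from ignoring byte values that occur often.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        removed = old[i1:i2].hex(" ")  # bytes present only in file 1
        added = new[j1:j2].hex(" ")    # bytes present only in file 2
        if tag == "delete":
            report.append(f"@{i1:#x}: only in file 1: {removed}")
        elif tag == "insert":
            report.append(f"@{i1:#x}: only in file 2: {added}")
        else:  # "replace"
            report.append(f"@{i1:#x}: {removed} -> {added}")
    return report


if __name__ == "__main__":
    for line in describe_changes(b"ABCDEF", b"ABXDEFGH"):
        print(line)
```

Offsets are positions in the first file, printed in hex. The output is meant for reading, not for feeding to a patch tool.
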
31

Courgette, by the Google Chrome team, looks like the most efficient tool for binary patching of executables.

To quote their data:

Here are the sizes for the recent 190.1 -> 190.4 update on the developer channel:

  • Full update: 10,385,920 bytes
  • bsdiff update: 704,512 bytes
  • Courgette update: 78,848 bytes
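
To put those three numbers in proportion, here is a small Python calculation. It uses only the sizes quoted above; the helper name `percent_of_full` is just for this example.

```python
# Update sizes in bytes, as quoted above.
FULL = 10_385_920
BSDIFF = 704_512
COURGETTE = 78_848


def percent_of_full(patch_size: int, full_size: int = FULL) -> float:
    """Patch size as a percentage of the full download."""
    return 100.0 * patch_size / full_size


print(f"bsdiff:    {percent_of_full(BSDIFF):.2f}% of the full update")
print(f"Courgette: {percent_of_full(COURGETTE):.2f}% of the full update")
print(f"Courgette patch is {BSDIFF / COURGETTE:.1f}x smaller than the bsdiff patch")
```

So for this particular update the bsdiff patch is under 7% of the full download, and the Courgette patch is under 1%.
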

Here are instructions to build it. Here is a Windows binary from 2018 courtesy of Mehrdad.

Community
Maxim Kholyavkin
  • 18
    The document says, "we wrote a new diff algorithm that knows more about the kind of data we are pushing - large files containing compiled executables". The implication is that it won't work as well (or maybe not at all) for other binary files. – James Jul 08 '14 at 10:56
  • 2
    Thank you for that link. But it is quite an ordeal to get it compiled under Windows. It installs a whole developer toolchain first, e.g. Git, Python, etc. Maybe it works, but on my machine the fetch used some ports that were blocked, and it failed. Does anybody know a binary download link? – Philm Jul 07 '15 at 17:40
  • 2
    @James Courgette is a true successor of `bsdiff`. From the document: Courgette `diff = bsdiff(concat(original, guess), update)`. With a reasonable `bdiff` algorithm you have `len(bdiff(concat(original,guess),update)) < len(bdiff(original,update))+C` with a small (constant) `C`. Having `C` set to 10 is a safe bet. Perhaps someone can calculate the `C` for `bsdiff`. Note that C==1 if the given `bdiff` algorithm guarantees `len(bdiff(concat(original,random),update)) <= len(bdiff(original,update))` for any values of original, random and update. – Tino Jun 30 '16 at 07:16
  • 1
    Unlike bsdiff's output, which is already compressed (with bzip2), you can further reduce the size of Courgette's output by using something like gzip or lzma on it. – MultiplyByZer0 Feb 04 '19 at 07:23
26

xdelta (website, GitHub) is another option. It seems to be more recent, but otherwise I have no idea how it compares to other tools like bsdiff.

Usage:

  • Creating a patch: xdelta -e -s old_file new_file delta_file
  • Applying a patch: xdelta -d -s old_file delta_file decoded_new_file

Installation:

  • Windows: Download the official binaries.
  • Chocolatey: choco install xdelta3
  • Homebrew: brew install xdelta
  • Linux: Available as xdelta or xdelta3 in your package manager.
MultiplyByZer0
Jared Beck
  • Windows binaries: [official xdelta3](https://github.com/jmacd/xdelta-gpl/releases), [unofficial xdelta](http://www.evanjones.ca/software/xdelta-win32.html). – Vlastimil Ovčáčík Sep 05 '17 at 07:49
  • This just saved me hours. Needed to test a certain build of an exe self extracting installer that was 1.1 GB. Copying that over the vpn was going to take 2.5 hours. I already had a different release from 3 months ago... Followed your instructions, the generated patch was (luckily) 18MB - guess there have only been minor changes. Applied the patch on the remote system. Performed various checksums on newly patched exe and it matches on both systems. There are so many ways this could have not worked but in my case it worked perfectly! – Ryan Feb 17 '21 at 23:50
  • I just tried `xdelta` and it has different command line commands. Like that: `xdelta delta old_file new_file delta_file` and `xdelta patch delta_file old_file decoded_new_file` – Mariusz Pawelski Nov 06 '22 at 00:59
  • scarab by Alexey Baskokov builds upon XDelta to add directory diffs. https://github.com/loyso/Scarab – Steve F Jun 20 '23 at 01:16
9

For small, simple patches, it's easiest just to tell diff to treat the files as text with the -a (or --text) option. As far as I understand, more complicated binary diffs are only useful for reducing the size of patches.

$ man diff | grep -B1 "as text"
       -a, --text
              treat all files as text
$ diff old new
Binary files old and new differ
$ diff -a old new > old.patch
$ patch < old.patch old
patching file old
$ diff old new
$

If the files are the same size and the patch just modifies a few bytes, you can use xxd, which is commonly installed with the OS. The following converts each file to a hex representation with one byte per line, then diffs the files to create a compact patch, then applies the patch.

$ xxd -c1 old > old.hex
$ xxd -c1 new > new.hex
$ diff -u old.hex new.hex | grep "^+" | grep -v "^++" | sed "s/^+//" > old.hexpatch
$ xxd -c1 -r old.hexpatch old
$ diff old new
$

For shells that support process substitution such as bash and zsh, there is a simpler method available:

$ comm -13 <(xxd -c1 old) <(xxd -c1 new) > old.hexpatch 
$ xxd -c1 -r old.hexpatch old
$ diff old new
$

Here the comm -13 removes lines that appear only in the first input as well as lines that appear in both inputs, leaving only the lines exclusive to the second input.
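
If xxd is not available, the same idea (record only the bytes that changed, keyed by offset) can be sketched in Python. This is a simplified stand-in, not a reimplementation of xxd: the function names are made up for this example, and the patch lines carry only `offset: hex byte`, without the ASCII column that xxd prints. Like the xxd approach, it assumes both files are the same size.

```python
def make_hexpatch(old: bytes, new: bytes) -> str:
    """List every byte of `new` that differs from `old`, one 'offset: hex' line each."""
    if len(old) != len(new):
        raise ValueError("this simple format only handles same-size files")
    lines = [
        f"{offset:08x}: {new_byte:02x}"
        for offset, (old_byte, new_byte) in enumerate(zip(old, new))
        if old_byte != new_byte
    ]
    return "\n".join(lines)


def apply_hexpatch(old: bytes, hexpatch: str) -> bytes:
    """Overwrite the listed offsets in a copy of `old` and return the result."""
    data = bytearray(old)
    for line in hexpatch.splitlines():
        offset_text, byte_text = line.split(":")
        data[int(offset_text, 16)] = int(byte_text.strip(), 16)
    return bytes(data)


if __name__ == "__main__":
    old, new = b"hello world", b"hellO w0rld"
    patch = make_hexpatch(old, new)
    print(patch)
    assert apply_hexpatch(old, patch) == new
```

The patch stays plain text, so it can be reviewed or edited by hand before it is applied.
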

cjfp
  • 1
    I really like the plaintext patch option provided by xxd. However, by default, GNU diff seems to prefix lines with `<` and `>`; to get it to prefix lines with `+` and `-` I had to use `diff -u`, e.g. `diff -u old.hex new.hex | grep "^+" | grep -v "^++" | sed "s/^+//" > old.hexpatch` – bmaupin Aug 14 '22 at 18:59
  • 1
    ... also, here's a one-liner that will create the patch without the intermediate files: `comm -13 <(xxd -c1 old) <(xxd -c1 new) > old.hexpatch` – bmaupin Aug 15 '22 at 16:17
  • @bmaupin thanks! I added both fixes. I've had diff aliased to diff -ur for so long that I completely forgot about the default format – cjfp Aug 15 '22 at 20:24
6

Modern port: a very useful .NET port of bsdiff/bspatch:

https://github.com/LogosBible/bsdiff.net

My personal choice.

I tested it, and of all the links it was the only one I was able to compile out of the box (with Visual Studio, e.g., Visual Studio 2013). (The C++ source elsewhere is a bit outdated, needs at least some polishing, and is 32-bit only, which puts real memory limits on the size of the files you can diff. This is a port of that C++ bsdiff code, and it even tests whether the patch results are identical to those of the original code.)

Further idea: With .NET 4.5 you could even get rid of the #Zip library, which is a dependency here.

I haven't measured whether it is slower than the C++ code, but it worked fine for me (bsdiff: a 90 MB file in 1-2 minutes), and only bspatch is time-critical for me, not bsdiff.

I am not really sure whether the whole memory of an x64 machine is used, but I assume so. The x64-capable build ("Any CPU") works, at least. I tried it with a 100 MB file.

- Besides: The cited Google project 'Courgette' may be the best choice if your main targets are executable files. But it takes work to build it (by Windows standards, at least), and for other binary files it also uses plain bsdiff/bspatch, as far as I have understood the documentation.

Peter Mortensen
Philm
2

HDiffPatch can run on Windows, macOS, Linux, and Android.

It supports diffs between binary files or directories.

Creating a patch: hdiffz [-m|-s-64] [-c-lzma2] old_path new_path out_delta_file

Applying a patch: hpatchz old_path delta_file out_new_path

Install:

Download the latest release, or download the source code and run make.

Jojo's Binary Diff is another good binary diff algorithm.

Peter Mortensen
sisong
  • There is one gotcha with the current version: it lets you apply the same patch multiple times, i.e. there's no crc checking before applying. I've accidentally mangled some files that way, because the hdiffs can also add bytes, not just substitute. The sfx feature is pretty nice... but no cigar because it can't record the name(s) of the files, so you still need some extra script to store those names (at least when diffing individual files, not dirs). – Fizz May 20 '22 at 02:56
1

diff and git-diff can handle binary files by treating them as text with -a.

With git-diff you can also use --binary, which produces ASCII encodings of binary files, suitable for pasting into an email, for example.

qwr
1

DiffPatchWpf (https://github.com/reproteq/DiffPatchWpf) is a simple binary patch maker tool.

It compares two binary files and saves the differences between them in a new file, patch.txt.

You can then apply those differences to another binary quickly and easily.

Example:

1- Load file Aori.bin

2- Load file Amod.bin

3- Compare and save Aori-patch.txt

4- Load file Bori.bin

5- Load patch Aori-patch.txt

6- Apply patch and save file Bori-patched.bin

https://youtu.be/EpyuF4t5MWk

Microsoft Visual Studio Community 2019

Version 16.7.7

.NETFramework,Version=v4.7.2

Tested on Windows 10 (64-bit)

-7

Assuming you know the structure of the file, you could use a C / C++ program to modify it byte by byte:

http://msdn.microsoft.com/en-us/library/c565h7xx(VS.71).aspx

Just read in the old file, and write out a new one modified as you like.

Don't forget to include a file format version number in the file so you know how to read any given version of the file format.
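
As a rough illustration of that read-modify-write idea, here is a sketch in Python rather than C / C++. The file layout is entirely hypothetical and invented for this example (a 4-byte magic `DEMO`, a 16-bit format version, and one 32-bit value, all little-endian), as is the function name `patch_value`; a real format would need its own layout and checks.

```python
import struct

# Hypothetical layout, invented for this example:
# 4-byte magic, uint16 format version, uint32 payload value (little-endian).
HEADER = struct.Struct("<4sHI")
MAGIC = b"DEMO"


def patch_value(old_path: str, new_path: str, new_value: int) -> None:
    """Read the old file, change one header field, and write a new file."""
    with open(old_path, "rb") as f:
        data = f.read()
    magic, version, _old_value = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not a DEMO file")
    if version != 1:
        # The version number tells us whether we know how to read this layout.
        raise ValueError(f"unsupported format version {version}")
    # Rebuild the header with the new value and keep the rest of the file as-is.
    patched = HEADER.pack(magic, version, new_value) + data[HEADER.size:]
    with open(new_path, "wb") as f:
        f.write(patched)
```

The version check is what makes this safe to ship: the patcher refuses files whose layout it does not understand instead of overwriting the wrong bytes.
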

Peter Mortensen
  • 6
    This solution is insane. Using C / C++ when `sed` already does everything you could ever want. Or, if you'd prefer to use an industrial-strength portable programming language, `perl`'s your best bet. If I'm writing router firmware, of course I'll go with C or C++, but diffing...? – Parthian Shot Jun 05 '15 at 23:33
  • What is the link to? To the installer for [Visual Studio 2003](https://en.wikipedia.org/wiki/Microsoft_Visual_Studio#.NET_2003)? The link redirects. – Peter Mortensen May 20 '22 at 17:49
  • 1
    This is essentially a link-only answer. – Peter Mortensen May 20 '22 at 17:50