12

I am trying to create a patch from two large folders (~7GB).

Here is how I'm doing it:

$ diff -Naurbw . ../other-folder > file.patch

But, possibly because of the file sizes, the patch is not created and diff fails with an error:

diff: memory exhausted

I tried making more than 15 GB of space available, but the issue persists. Could someone help me with the flags I should use?

John Kugelman
pritam
  • Yes, I tried googling it and found some parameter changes, but the "memory exhausted" error is still there, even with the "--speed-large-files" flag. – pritam Mar 07 '13 at 06:15
  • How about diffing them in multiple steps? E.g. split the folders into, say, 1GB blocks, diff each, then concatenate the patches, though I'm not sure whether a diff can be split like that (so you might need some extra logic to apply the patch). Why are you diffing 7GB folders in the first place? Surely only some files/folders inside them have changed? – Thomas Mar 07 '13 at 06:19
  • Yes, I tried diffing them separately, creating different patches and merging them, but the merged patch does not apply. When creating a single patch, the patch size reaches 800KB, but after merging it becomes 90KB and it does not apply. – pritam Mar 07 '13 at 06:37
  • @pritam check out my answer above, sir – Igor Apr 08 '15 at 07:33

3 Answers

19

Recently I came across this too when I needed to diff two large files (>5 GB each).

I tried 'diff' with different options, but even --speed-large-files had no effect. Other methods, such as splitting the files into smaller ones, using xdelta, or sorting the files as per this suggestion, didn't help either. I even got my hands on a very powerful VM (>72 GB RAM), but still got the memory exhausted error.

I finally got it to work by adding the following parameter to sysctl.conf (sudo vim /etc/sysctl.conf):

vm.overcommit_memory=1

vm.overcommit_memory has three values (0,1,2) and sets the kernel virtual memory accounting mode. From the proc(5) man page:

0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit

To apply the setting from sysctl.conf right away (the command also prints the values it loads), run

sudo sysctl -p

Don't forget to change this parameter back when you finish!
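
If the change is only needed for a single run, a runtime-only toggle avoids editing sysctl.conf at all. The following is a minimal sketch, not a tested recipe: `with_overcommit` and the `OVERCOMMIT_FILE` variable are hypothetical names introduced here, and the writes to /proc/sys/vm/overcommit_memory need root.

```shell
#!/bin/sh
# Sketch: run one command with vm.overcommit_memory temporarily set to 1,
# then restore whatever value was there before.
# OVERCOMMIT_FILE is a hypothetical override, present only so the logic can
# be tried against a scratch file instead of the real kernel setting.
OVERCOMMIT_FILE="${OVERCOMMIT_FILE:-/proc/sys/vm/overcommit_memory}"

with_overcommit() {
    # Remember the current accounting mode (0, 1 or 2).
    old=$(cat "$OVERCOMMIT_FILE") || return 1

    # Switch to "always overcommit"; this write requires root.
    echo 1 > "$OVERCOMMIT_FILE" || return 1

    # Run the wrapped command and keep its exit status.
    "$@"
    status=$?

    # Put the previous mode back, whatever the command returned.
    echo "$old" > "$OVERCOMMIT_FILE"
    return "$status"
}
```

Run as root, for example `with_overcommit diff -Naur old-folder new-folder > file.patch` (the folder names are placeholders). A value written this way does not survive a reboot, and the function restores the previous value even if the command exits with an error.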

John Kugelman
Igor
  • I agree, interesting, non-standard, and it worked for me! Comparing two 70GB files, I see e.g. 317TB virtual and 150TB resident RAM... a comparison that could not run before even with 250GB RAM now completes. Very clever! – David W Mar 17 '16 at 17:02
  • Excellent. You can also add swap space: https://askubuntu.com/questions/349156/how-to-use-hard-disk-as-ram-like-in-windows – YOGO Mar 01 '19 at 08:21
  • Thank you. It worked for me like this: `cat /proc/sys/vm/overcommit_memory`, `echo 1 > /proc/sys/vm/overcommit_memory`, diff, `echo 0 > /proc/sys/vm/overcommit_memory`, `cat /proc/sys/vm/overcommit_memory` – matt Jul 04 '20 at 20:27
2

bsdiff is slow and requires a lot of memory; xdelta creates large deltas for large files.

Try HDiffPatch for large files: https://github.com/sisong/HDiffPatch

  • supports diffing large binary files or directories;
  • runs on Windows, macOS, Linux, and Android;
  • both diff and patch can run with limited memory.

Usage example:

  • Creating a patch: hdiffz -s-256 [-c-lzma2] old_path new_path out_delta_file
  • Applying a patch: hpatchz old_path delta_file out_new_path
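
Before relying on a delta, it can be worth confirming that it rebuilds the new file exactly. The following is a rough sketch under stated assumptions: `roundtrip_check` is a hypothetical helper name, it only uses the `hdiffz` and `hpatchz` invocations shown in the usage example above, and it assumes both tools are on the PATH and that the delta path does not exist yet.

```shell
#!/bin/sh
# Sketch: create a delta with hdiffz, apply it with hpatchz into a scratch
# directory, and check that the rebuilt file is identical to the new one.
# Written for single files; for directories, `diff -rq` would replace cmp.
roundtrip_check() {
    old=$1
    new=$2
    delta=$3

    # Scratch directory for the rebuilt copy.
    work=$(mktemp -d) || return 1

    # Create the delta (same -s-256 option as in the usage example).
    hdiffz -s-256 "$old" "$new" "$delta" || { rm -rf "$work"; return 1; }

    # Apply the delta to the old file.
    hpatchz "$old" "$delta" "$work/rebuilt" || { rm -rf "$work"; return 1; }

    # Exit status 0 only if the rebuilt file matches the new file byte for byte.
    cmp -s "$new" "$work/rebuilt"
    status=$?

    rm -rf "$work"
    return "$status"
}
```

For example, `roundtrip_check old.bin new.bin out.delta && echo "delta verified"`, where the three file names are placeholders.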
Alexey Vazhnov
sisong
-2

Try sdiff. It comes preinstalled on some Linux distributions.

sdiff a.txt b.txt --output=c.txt

will show the differences between the two files side by side.

This worked perfectly for me.
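
On large inputs the full side-by-side listing is hard to read, so it may help to print only the lines that differ. A small sketch, assuming GNU sdiff and its `-s` (suppress common lines) option; `count_diff_lines` is a hypothetical helper name. Note that sdiff relies on the same comparison as diff, so this sketch makes no claim about avoiding the memory error.

```shell
#!/bin/sh
# Sketch: count how many lines differ between two files.
# With -s, sdiff prints one output line per changed, added, or removed line
# and omits lines that are common to both files.
count_diff_lines() {
    # sdiff exits with status 1 when the files differ, so only its output is used.
    sdiff -s "$1" "$2" | wc -l
}
```

For example, `count_diff_lines a.txt b.txt` prints the number of differing lines.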

MLavoie
Javeed Shakeel