sysadmin1138 and Martin have reported a replacement for rsync that works on block devices (partitions). It is based on Perl, but I additionally want to store two-way diffs.
It applies the changes in a block device to a pre-existing, outdated backup image. This is the second-best way to do that, after lvmsync, which I did not use because my block device is not in LVM.
But I also wanted to collect the changes separately, in order to be able to regenerate the previous backup image (e.g., to recover a deleted file).
The following code collects these changes while the rsync replacement runs:
patch=diff.`date +'%Y%m%d.%H%M%S.%N'`.gz
ssh $username@$backupnas "perl -'MDigest::MD5 md5' -ne "\
" 'BEGIN{\$/=\1024};print md5(\$_)' $remotepartition "\
" | gzip -c "\
|gunzip -c|LANG= tee >(wc -c|LANG= sed '1s%^%number of 64 bytes blocs: %' >&2) \
|LANG= perl -'MDigest::MD5 md5' -e 'open DISK,"'"<$partition"'" or die $!; '\
' while( read DISK,$read,1024) '\
' { '\
' read STDIN,$md,16; '\
' if($md eq md5($read)) {print "s"} else {print "c" . $read } '\
' } '\
| gzip -c \
|ssh $username@$backupnas "touch $remotepartition;LANG= tee -a $patch|gunzip -c"\
" |perl -e 'open REVP,\"| gzip -c > rev.$patch\"; "\
" open PREVIOUS,\"<$remotepartition\"; "\
' $rev = "PREVIOUS met EOF if length<1024."; $rev=$rev.$rev; '\
' $rev=$rev.$rev.$rev.$rev; $rev=$rev.$rev.$rev.$rev; '\
' while(read STDIN,$read,1) '\
' { '\
' if ($read eq "s") '\
' { '\
' if (length($rev) eq 1024) { print REVP "s" } ; '\
' $s++ '\
' } else { '\
' if ($s) { seek STDOUT,$s*1024,1; seek PREVIOUS,$s*1024,1; $s=0}; '\
' if (read PREVIOUS,$rev,1024) { print REVP "c".$rev }; '\
' read STDIN,$buf,1024; '\
' print $buf '\
' } '\
" }' 1<> $remotepartition "
$rev is initialized to a scalar string of length 1024 (I don't know a better way to do this).
Without the formatting and with more or die checks, this is:
patch=essai_delta.`date +'%Y%m%d.%H%M%S.%N'`.gz
ssh username@backupnas "perl -'MDigest::MD5 md5' -ne 'BEGIN{\$/=\1024};print md5(\$_)' essai_backup | gzip -c" | \
gunzip -c | LANG= tee >(wc -c|LANG= sed '1s%^%bin/backup_essai: number of 64 bytes blocs treated : %' >&2) | \
LANG= perl -'MDigest::MD5 md5' -e 'open DISK,"</data/data/com.spartacusrex.spartacuside/files/essai" or die $!; while( read DISK,$read,1024) { read STDIN,$md,16; if($md eq md5($read)) {print "s"} else {print "c" . $read } }' /data/data/com.spartacusrex.spartacuside/files/essai | \
gzip -c | \
ssh username@backupnas "LANG= tee -a $patch | gunzip -c | perl -e 'open REVP,\"| gzip -c > rev.$patch\" or die \$!; open READ,\"<essai_backup\" or die \$!; \$rev = \"if length<1024, EOF met in READ.\"; \$rev=\$rev.\$rev.\$rev.\$rev; \$rev=\$rev.\$rev.\$rev.\$rev; \$rev=\$rev.\$rev; while(read STDIN,\$read,1) { if (\$read eq \"s\") {if (length(\$rev) eq 1024) { print REVP \"s\" or die \$! } ; \$s++} else { if (\$s) { seek STDOUT,\$s*1024,1 or die \$!; seek READ,\$s*1024,1 or die \$!; \$s=0}; if (read READ,\$rev,1024) { print REVP \"c\".\$rev or die \$! } else { print STDERR \$!}; read STDIN,\$buf,1024 or die \$!; print \$buf or die \$!} }' 1<> essai_backup"
To apply the forward or backward diff, I can use:
ssh username@backup_nas "LANG= cat diff_delta.20141202.110302.0935 | gunzip -c | perl -ne 'BEGIN{\$/=\1} if (\$_ eq\"s\") {\$s++} else {if (\$s) { seek STDOUT,\$s*1024,1; \$s=0}; read STDIN,\$buf,1024; print \$buf}' 1<> image.file"
So I managed to answer the first version of this post. This was tested on a 200k example with some modifications.
I have specific questions about this code.
Why did the original example use read ARGV? Is it bad practice?
I have added many or die $! checks. Is that wise, or does it just hurt readability?
PREVIOUS and STDOUT are the same file opened twice (to avoid seek STDOUT,-1024,1). Is this considered good practice?
[question migrated manually from programmers.so]