
I've run a cvs2git migration on a CVS repository that's over 2 GB. I wrote a script that traverses the new git repository and the CVS module to verify that the objects are the same. The text files migrate just fine and have the same sha1sum; however, ALL of the binary files have different sha1sums, even though they are all flagged as binary in CVS (-kb). Every other topic I've read about cvs2git and binary files blames the issue on binary files not being flagged as binary (-kb), but that's not the case here. What else could be the problem?
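For reference, a check of this kind can be sketched roughly as follows. This is a minimal Python 3 sketch, not the actual script: it assumes both the CVS module and the git repository have been checked out to plain directories, and the function names and the `SKIP_DIRS` set are made up for illustration.

```python
import hashlib
import os

# Assumption: metadata directories that should not be compared.
SKIP_DIRS = {"CVS", ".git"}


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-1 digest."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = sha1_of_file(full)
    return digests


def compare_trees(cvs_checkout, git_checkout):
    """Return the sorted relative paths whose contents differ."""
    cvs = tree_digests(cvs_checkout)
    git = tree_digests(git_checkout)
    common = set(cvs) & set(git)
    return sorted(path for path in common if cvs[path] != git[path])
```

Note that this hashes file contents the way sha1sum does; it does not compute git blob IDs, which include a header and would never match a plain sha1sum.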

The scripts I execute to do the migration are below:

./Python-2.7.3/python ./cvs2svn-trunk/cvs2git \
--blobfile=/path/to/git-blob.dat \
--dumpfile=/path/to/git-dump.dat \
--username=cvs2git \
/cvsroot/database

cd /gitroot; mkdir database; cd database; git init

cat /path/to/git-{blob,dump}.dat | git fast-import
joshm1

2 Answers


Your problem could be explained if your repository is a CVSNT repository, as opposed to a standard CVS repository. CVS records whether a file is binary once, for all revisions, whereas CVSNT records the file type revision by revision. cvs2svn/cvs2git only reads the file-wide binary attribute, not CVSNT's revision-by-revision attributes. Therefore, it doesn't know that a file has been marked binary in CVSNT.

This is the main reason that cvs2svn/cvs2git does not officially support converting from CVSNT repositories.

mhagger

Do these binary files contain some strings in the form of $Id ...$? That was the problem for me some time ago (it replaced it with $Id$ in binary files), but it should be fixed in the newest versions, see this commit.

In any case, I recommend using a hex editor to find out what the differences actually are.
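If no hex editor is at hand, a few lines of Python 3 can locate the first differing byte and print the bytes around it. This is only a sketch: `first_difference` and `hex_context` are illustrative names, and it reads both files fully into memory.

```python
def first_difference(a, b):
    """Return the first offset where a and b differ, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a prefix of the other.
        return min(len(a), len(b))
    return None


def hex_context(data, offset, width=16):
    """Format the bytes around offset as start address, hex, and repr."""
    start = max(0, offset - width)
    chunk = data[start:offset + width]
    return "%08x  %s  %r" % (start, chunk.hex(), chunk)
```

For example, read both versions with `open(path, "rb").read()`, call `first_difference` on them, and print `hex_context` for each side at that offset. A collapsed keyword shows up as `$Id$` on one side and `$Id: ... $` on the other.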

I also notice that you don't use an options file. I'm not sure what defaults cvs2git uses in that case, but it would be worth trying a customized version of cvs2git-example.options.

robinst
  • I'm seeing this exact problem. I'm using the latest version of cvs2git (2.4.0). I will try using the options file method, since I am using the command-line. – caspian Dec 02 '13 at 16:54
  • Hmmm - so using the default options file I get the same issue. If I change it to always set KeywordHandlingPropertySetter to untouched, it works. I AM using CVSNT, so I think the answer from @mhagger applies to me. – caspian Dec 02 '13 at 17:42