2

I would like to migrate a project from SVN to Git and retain history.

However, the SVN repository contains some configuration files with production passwords and other sensitive data, so I would like to exclude those few files from the migrated history.

How would I go about doing this?

Artjom B.
Svein Fidjestøl
  • How many commits do you have, and how many do you need to remove? – Ôrel Sep 24 '15 at 12:17
  • We have about 10 000 commits and about 10 configuration files. I checked one of the configuration files and it has about 100 commits. Though I wouldn't want to remove the whole commit for these 100 commits. I only want to remove the part of the commit that relates to the configuration file. – Svein Fidjestøl Sep 24 '15 at 12:21
  • http://stackoverflow.com/questions/872565/remove-sensitive-files-and-their-commits-from-git-history – UBCoder Sep 24 '15 at 12:31

2 Answers

3

The easiest solution would be to migrate your SVN repository to Git on your local machine and then remove the files that contain the sensitive data before you push the migrated history to a remote repository.

For example:

# Migrate the SVN project into a local repo
git svn clone svn://server/svnroot \
    --authors-file=authors.txt \
    --no-metadata \
    -s your_project

cd your_project   

# Remove the 'passwd.txt' file from the history of the local repo
git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch passwd.txt' \
    --prune-empty --tag-name-filter cat -- --all

As long as you don't push the local Git repository to a remote location, you can safely remove any file from the entire history using git filter-branch. After the files are removed, it's safe to publish the repo anywhere you want.
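Since the question mentions about ten configuration files, here is a minimal sketch of the same approach with more than one path, followed by a cleanup and a check. It runs against a throwaway repository that stands in for the result of git svn clone; the path config/production.properties and the demo commits are made up for illustration. Note that git filter-branch keeps backup refs under refs/original, so the old blobs stay in the local repository until those refs and the reflog are cleared.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the result of 'git svn clone'.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Two commits that mix ordinary code with hypothetical sensitive files.
mkdir config
echo 'print("hello")' > app.py
echo 'secret' > passwd.txt
echo 'db.password=secret' > config/production.properties
git add .
git commit -q -m "Initial import"
echo 'print("hello again")' > app.py
echo 'secret2' > passwd.txt
git commit -q -a -m "Change code and password"

# Remove every listed path from all commits on all refs.
# FILTER_BRANCH_SQUELCH_WARNING skips the advisory pause in newer Git versions.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch -q passwd.txt config/production.properties' \
    --prune-empty --tag-name-filter cat -- --all

# Drop the backup refs and unreachable objects that still hold the old blobs.
git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc --quiet --prune=now

# Verify: count the commits on any ref that still touch the sensitive paths.
leftover=$(git log --all --oneline -- passwd.txt config/production.properties | wc -l | tr -d ' ')
commits=$(git rev-list --count --all)
echo "commits touching sensitive files: $leftover"
echo "commits kept: $commits"
```

Both demo commits also change app.py, so they survive the rewrite with only the sensitive paths stripped out, which matches the wish to keep the rest of each commit. On a real migration, the same git log check is worth running before the first push.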

An alternative to git filter-branch is a tool called BFG Repo-Cleaner, which uses its own (supposedly faster) implementation to remove a file from the history of a Git repository. With 10,000 commits it might be worth considering, since the running time of git filter-branch is at least linear in the number of commits to process.
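As a rough sketch of what a BFG run could look like, the helper below wraps the --delete-files option and the cleanup that BFG's own documentation recommends afterwards. The function name bfg_delete_file and the jar location are assumptions; BFG has to be downloaded separately and needs Java.

```shell
#!/bin/sh

# bfg_delete_file BFG_JAR REPO FILE_NAME   (hypothetical helper name)
# Deletes every file called FILE_NAME from the history of REPO with BFG,
# then expires the reflog and garbage-collects so the old blobs are dropped.
bfg_delete_file() {
    bfg_jar=$1     # path to the downloaded BFG jar (assumption)
    repo=$2        # local repository, e.g. the result of 'git svn clone'
    file_name=$3   # BFG matches on the file name, not on a full path

    java -jar "$bfg_jar" --delete-files "$file_name" "$repo" || return 1
    ( cd "$repo" &&
      git reflog expire --expire=now --all &&
      git gc --prune=now --aggressive )
}

# Example invocation (not executed here):
#   bfg_delete_file ./bfg.jar your_project passwd.txt
```

Two things to keep in mind: --delete-files matches by file name, so it removes every file with that name anywhere in the tree, and by default BFG leaves the contents of the latest commit untouched, so the file should first be deleted from the current HEAD with an ordinary commit.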

Community
Enrico Campidoglio
-2

Basically, you have two strategies:

  1. Clean up SVN first and then migrate to Git
  2. Migrate first and then clean up in Git

Clean up SVN first and then migrate to Git

According to the Subversion FAQ:

"(...)your only recourse is to svnadmin dump your repository, then pipe the dumpfile through svndumpfilter (excluding the bad path) into an svnadmin load command(...)"

http://subversion.apache.org/faq.html#removal
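The pipeline described in that quote can be sketched as a small shell function. The function name filter_svn_repo and the repository paths in the example are hypothetical; it needs to run on a machine with direct file access to the SVN repository, because svnadmin works on the repository directory rather than on a URL.

```shell
#!/bin/sh

# filter_svn_repo OLD_REPO NEW_REPO PATH...   (hypothetical helper name)
# Dumps OLD_REPO, drops the given repository paths from the dump stream,
# and loads the filtered stream into a freshly created NEW_REPO.
filter_svn_repo() {
    old_repo=$1
    new_repo=$2
    shift 2        # the remaining arguments are the paths to exclude

    svnadmin create "$new_repo" || return 1
    svnadmin dump --quiet "$old_repo" \
        | svndumpfilter exclude "$@" \
        | svnadmin load --quiet "$new_repo"
}

# Example invocation (not executed here, paths are placeholders):
#   filter_svn_repo /var/svn/project /var/svn/project-clean \
#       trunk/config/passwd.txt trunk/config/production.properties
```

The excluded paths are paths inside the repository, so a file that also exists on branches or tags has to be listed once per location. The Git migration would then be run against the cleaned repository instead of the original one.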

Migrate first and then clean up in Git

GitHub has a good article on this:

https://help.github.com/articles/remove-sensitive-data/

UBCoder