
The current repository has commits

  A -> B -> C
            ^
            |
         HEAD

I want to create a new repository whose master branch begins at commit C (HEAD) of the current repository.

  C
  ^
  |
HEAD

Additionally, if a new commit D is added to the current repository:

  A -> B -> C -> D
                 ^
                 |
                HEAD

The new repository will become:

  C -> D
       ^
       |
      HEAD

This should happen on the next push/mirror.

Because I was learning to develop software while committing changes to the project, the repository has grown in size: large files were added and removed throughout its long history (500 commits).

Can this workflow be easily achieved in git? (Using both GitHub and GitLab)

doak
Sean Pianka
  • Why do you want to create a new repository **and** keep the old one? – Leon Dec 20 '18 at 06:10
  • I would like to keep all future history in a new public repository, while leaving old commits (past a certain point in time) in the old repository. – Sean Pianka Dec 20 '18 at 07:06
  • @SeanPianka, the bounty will be awarded to the currently accepted answer if you do not take care! – doak Dec 26 '18 at 18:32

3 Answers


You can easily create a new repository from an existing one, at least locally: just run git clone <src repo> [dest dir] (possibly with --depth or similar to save on size, though that comes with caveats; see the manual for details). Making the new repository automatically follow the original's history isn't going to be easy, though. The new repository will have its origin set up to point at the original, but updating it will require a pull, a fetch plus merge, or similar, as usual. You may be able to set up a post-commit hook in the old repository to automate the cd <new repo> ; git pull ; cd $OLDPWD bit, but I'm not well-versed in how Git's hooks work. Alternatively, you could set up the new repository as a remote in the old one and push to it, though I'm not sure how that would affect the new repository's working tree (i.e., what's checked out). And making any of this work with a remote provider like GitHub would be an entirely different can of worms.
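The shallow-clone idea can be tried out in a scratch directory. The following is only a sketch under stated assumptions: the repository names old and new, the file file.txt, and the demo identity are placeholders invented for the example, not anything from the question. Note that git ignores --depth when cloning from a plain local path, which is why the sketch uses a file:// URL.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder identity for the throwaway commits (assumption for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Build a scratch "old" repository with three commits A, B, C.
git init -q "$work/old"
for msg in A B C; do
    echo "$msg" > "$work/old/file.txt"
    git -C "$work/old" add file.txt
    git -C "$work/old" commit -q -m "$msg"
done

# Clone only the newest commit. The file:// URL matters here:
# for a plain local path, git ignores --depth.
git clone -q --depth 1 "file://$work/old" "$work/new"

old_count="$(git -C "$work/old" rev-list --count HEAD)"
new_count="$(git -C "$work/new" rev-list --count HEAD)"
echo "old: $old_count commits, new: $new_count commit(s)"
```

The clone sees a single commit, but it is still the original commit C with its original hash; the older history is merely not downloaded.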

If you want to try cleaning up your history, you may want to look into rebase and possibly cherry-pick.

solarshado

What you are trying to do is close to impossible. In git, the history leading to a given commit is an inseparable part of that commit. Thus the commits denoted with C in the following two histories

  A -> B -> C
            ^
            |
         HEAD

and

  C
  ^
  |
HEAD

are in fact two different commit objects with two different hashes (barring a hash collision). The only way to achieve the desired setup would be to force those two commit objects to have the same hash value, which would let you fool Git into pushing new commits based on C into repositories with different prehistories. This is possible in theory but hardly in practice (if you managed it, you could also forge digitally signed documents or alter the Bitcoin blockchain).
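That the parent is part of a commit's identity can be checked directly. The sketch below is an illustration in a scratch repository; the path, the file name, and the demo identity are assumptions made up for the example. It builds A -> B -> C, then uses git commit-tree to create a parentless commit that records exactly the same tree as C, and compares the two hashes.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder identity for the throwaway commits (assumption for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
repo="$work/repo"

# Scratch history A -> B -> C.
git init -q "$repo"
for msg in A B C; do
    echo "$msg" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" commit -q -m "$msg"
done

c_with_history="$(git -C "$repo" rev-parse HEAD)"
tree="$(git -C "$repo" rev-parse 'HEAD^{tree}')"

# A commit with the same tree and message as C, but without any parent.
c_without_history="$(git -C "$repo" commit-tree "$tree" -m C)"

echo "shared tree:       $tree"
echo "C with parents:    $c_with_history"
echo "C without parents: $c_without_history"
```

Both commits describe identical file contents, yet their hashes differ because the commit object with history contains a parent line and the other does not.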

An approximation of the desired flow will be to maintain two branches in your local repository corresponding to the two remotes. You will work on one of the branches, and merge it regularly into the other branch:

old_repo_branch:      A -> B -> C ---->  D' -> E'
                                         ^     ^
                                        /     /
                                       /     /
new_repo_branch:                C' -> D --> E

You will have to push new_repo_branch to the new repository and old_repo_branch to the old repository. Such a flow becomes hard to manage, however, if you need to branch your development: each parallel development stream has to be branched in both histories, and each pair of corresponding branches has to be kept in sync in the same way.

Leon

Preamble

You should really reconsider your workflow. Most likely you are trying to reproduce a strange workflow copied from some ancient VCS. Git is used to track history and lets you rewrite it, but you need to decide which history you want. Maintaining several variants of the same history is probably a bad idea.

500 commits is not a big number for Git; the Linux kernel received about 63,000 (!) commits in 2018 alone ;)

Solution

Nevertheless, here is a hacky proof of concept which fulfils your needs. There is no need for a dedicated repository, the rewritten history is just stored in some dedicated branch. The first run will create that orphan branch, subsequent runs will update it with the latest commits. Both calls look the same:

$ path/to/crazy-rebase <rewritten-branch> <last-commit-to-transfer>

For example:

$ ./crazy-rebase cutoff master

How it works

During the first run, the script creates an orphan branch (e.g. cutoff) from the given revision (e.g. master) without any previous history. Every further run cherry-picks each commit that is not yet present onto this orphan branch (using a rebase). The commits to transfer are deduced from the last successful run, whose end point is stored in the special reference CUTOFF_BASE.
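The two phases can also be walked through by hand. The sketch below is a simplified illustration in a scratch repository, not the script itself: the path, the file name, and the demo identity are assumptions, it keeps the base hash in a shell variable instead of the CUTOFF_BASE reference, and it moves the branch with git branch -f rather than the second rebase the script uses.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder identity for the throwaway commits (assumption for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
repo="$work/repo"

git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

make_commit() {
    echo "$1" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" commit -q -m "$1"
}
make_commit A; make_commit B; make_commit C

# First run: start the orphan branch from the tree of C.
base="$(git -C "$repo" rev-parse master)"
git -C "$repo" checkout -q --orphan cutoff master
git -C "$repo" commit -q -m "Cutoff branch, based on '$base'"
git -C "$repo" checkout -q master

# The original history moves on.
make_commit D

# Later run: replay base..master onto cutoff (detached HEAD),
# then move the cutoff branch to the replayed tip.
tip="$(git -C "$repo" rev-parse master)"
git -C "$repo" rebase -q --onto cutoff "$base" "$tip"
git -C "$repo" branch -f cutoff HEAD
git -C "$repo" checkout -q master

cutoff_count="$(git -C "$repo" rev-list --count cutoff)"
master_count="$(git -C "$repo" rev-list --count master)"
echo "master: $master_count commits, cutoff: $cutoff_count commits"
```

After the second phase, cutoff holds two commits (the cut-off root and a copy of D) while master still holds all four, and both branches point at the same tree.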

Script crazy-rebase:

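#!/usr/bin/env bash
# Usage: crazy-rebase <rewritten-branch> <last-commit-to-transfer>

if [ $# -ne 2 ]; then
    echo "Usage: $0 <rewritten-branch> <last-commit-to-transfer>" >&2
    exit 2
fi

CUTOFF="$1"     # orphan branch that holds the rewritten history
CURRENT="$2"    # branch or revision whose new commits are transferred

# Reference that records the last original commit already transferred.
LAST_BASE="CUTOFF_BASE"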


error() {
    local errcode=$?
    echo "ERR: $*" >&2
    exit $errcode
}

log() {
    echo "LOG: $*" >&2
}

ret() {
    return "$1"
}


prepare() {
    local cutoff="$1"
    local current="$2"
    local base_hash

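    # Nothing to prepare if the cut-off branch already exists.
    # --verify avoids matching a tag or remote branch of the same name.
    git show-ref --verify --quiet "refs/heads/$cutoff" &&
    return 0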

    log "Preparing cut-off branch '$cutoff' ..." &&
    base_hash="$(git show -s --pretty=%H "$current")" &&
    git checkout --quiet --orphan="$cutoff" "$current" &&
    git commit -m "Cutoff branch, based on '$base_hash'" &&
    git checkout --quiet "$current" &&
    git update-ref "$LAST_BASE" "$base_hash" &&
    log "Cut off branch '$cutoff' created." &&
    exit 0 ||
    error "Failed to init cut-off branch '$cutoff'."
}

rebase() {
    local cutoff="$1"
    local current="$2"
    local current_hash
    local errcode

    log "Rebasing commits '$LAST_BASE..$current' onto cut-off branch '$cutoff' ..."
    current_hash="$(git show -s --pretty=%H "$current")" &&
    git rebase --rebase-merges --onto "$cutoff" "$LAST_BASE" "$current_hash" || {
        errcode=$?
        log "STARTING INTERACTIVE SHELL TO RESOLVE REBASE."
        log "Use 'git rebase --continue' after resolving the issue e.g. with 'git mergetool'."
        log "Do not forget to exit this shell to continue the script."
        $SHELL
        if test -e "$(git rev-parse --git-dir)/rebase-merge"; then
            git rebase --abort 2>/dev/null
            git checkout --quiet "$current"
            ret $errcode
            error "Failed to transfer commits '$LAST_BASE..$current' to '$cutoff'."
        fi
    } &&
    git rebase --rebase-merges HEAD "$cutoff" &&
    git checkout --quiet "$current" &&
    # Record the hash that was actually transferred, in case the branch moved meanwhile.
    git update-ref "$LAST_BASE" "$current_hash" &&
    log "Cut-off branch '$cutoff' updated." &&
    true
}


prepare "$CUTOFF" "$CURRENT" &&
rebase "$CUTOFF" "$CURRENT" &&
true

Use this if you want to push the result to a remote repository:

$ git push <remote> cutoff:<name-of-cutoff-on-remote>
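To see what ends up on the remote, the push can be tried against a local bare repository. This is a sketch under assumptions: the remote name public, the paths, the branch name master on the receiving side, and the demo identity are all invented for the example, and a hosted GitHub or GitLab remote would be addressed by URL instead of a local path.

```shell
#!/usr/bin/env bash
set -eu

# Placeholder identity for the throwaway commits (assumption for this demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
repo="$work/old"

# Scratch history A -> B -> C plus an orphan cut-off branch.
git init -q "$repo"
for msg in A B C; do
    echo "$msg" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" commit -q -m "$msg"
done
git -C "$repo" checkout -q --orphan cutoff
git -C "$repo" commit -q -m "Cutoff branch"

# Stand-in for the new public repository (hypothetical remote "public").
git init -q --bare "$work/public.git"
git -C "$repo" remote add public "$work/public.git"

# Publish the cut-off branch as the remote's master branch.
git -C "$repo" push -q public cutoff:master

local_count="$(git -C "$repo" rev-list --count --all)"
remote_count="$(git -C "$work/public.git" rev-list --count --all)"
echo "local: $local_count commits, remote: $remote_count commit(s)"
```

Only the commits reachable from the pushed branch are transferred, so the bare repository receives the single cut-off commit and none of A, B, or C.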
doak