
For the past several months, I've been working on a massive upgrade project for an 11-year-old application that consists of well over 3,500 individual files. At one point, the files were copied (they were being managed by SVN, then ...), and conversion work began, in parallel with continuing work in support of the customer.

Within the conversion repository (which is entirely unrelated to the "other" git repository that supplanted SVN), about 314 commits have been made, and some of them are gigantic: conversion of <? to <?php, replacement of mysql_ calls with calls to an interface library, and so on.

Now, the task at hand is to bring the roughly 120 files that have changed in the "other" repo (which is eventually to be abandoned ...) into this one. My approach so far has been to create a branch, copy the files into that new branch, and re-apply "basic" changes such as the foregoing, using automatic code-analysis tools that I have developed for that purpose.
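Concretely, that import step looks something like the following sketch. (A sketch only: changed-files.txt and the production-checkout path are illustrative names, and it assumes one repo-relative path per line.)

```sh
# In the conversion repo: a branch to hold the imported production files.
git checkout -b production-import

# Copy in the ~120 changed files, preserving their relative paths.
# changed-files.txt lists one repo-relative path per line.
while read -r f; do
    mkdir -p "$(dirname "$f")"
    cp "/path/to/production-checkout/$f" "$f"
done < changed-files.txt

git add -A
git commit -m "Import latest production versions of the changed files"
```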

And here is where I am uncertain what to do next. I want to re-make the changes that I made to those files, as reflected in the 300-odd commits now on the main branch of my conversion repo, and to do so as automatically as possible. I have a file which contains a list of all the files in question. My thought is to cherry-pick some of the older commits out of the main branch, and apply them to the files in the new branch (which might never be merged into master). To my way of thinking, only those commits which touch any of those files need to be reapplied. (But some of those commits touched thousands of files, including but not limited to the ones in play here.)
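If I go the cherry-pick route, one way I can see to restrict each gigantic commit to just the files in play would be roughly this. (Again only a sketch: it reuses the hypothetical changed-files.txt, and assumes paths without whitespace and non-merge commits.)

```sh
# Oldest-first list of main-branch commits that touched any file in play.
git log --reverse --format=%H main -- $(cat changed-files.txt) > commits.txt

# On the production-import branch, re-apply each one, limited to our files.
while read -r c; do
    # Take only the hunks of this commit that touch our files;
    # --3way lets git fall back to a content merge when context has drifted.
    git show "$c" -- $(cat changed-files.txt) | git apply --3way ||
        { echo "Conflict applying $c -- fix by hand, then continue"; break; }
    git add -A
    git commit -m "Re-apply $c, restricted to the imported files"
done < commits.txt
```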

At this point, I'm standing on the cusp of a decision, having not yet done anything, and not quite certain how best to proceed.

Remember: there are two separate git repos, but they are entirely unrelated to one another. (The one used for production maintenance didn't even exist at that time.) So, I can use it ... and did use it ... only to obtain a list of the files that have been touched, and their most-recent versions. When the conversion project is finished, the conversion repo will be discarded, and the present production repo will be frozen. An entirely new repo will be created with which to move forward.
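For the record, the file list came out of the production repo with something along these lines (a sketch; files that were later renamed or deleted will show up too, and may need pruning by hand):

```sh
# In the production repo: every file ever touched in its history.
# --format= suppresses the commit headers; grep . drops the blank lines.
git log --name-only --format= | grep . | sort -u > changed-files.txt
```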

Advice earnestly sought . . .

EDIT: I have since considered a completely different approach, which would abandon the course that I started on, throw away that branch entirely, and pursue a different strategy: going through the old repo, grabbing selected commits as patches, and trying to apply those patches to the existing (albeit possibly much-changed) modules. Or, if need be, doing the same thing by hand. Only about 100 commits, give or take, to do... Comments are cordially (and earnestly) requested about either strategy.
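In command form, that patch strategy might look like this sketch. ($START, the directory names, and the choice of --reject are my assumptions; git apply --3way isn't an option here, because the two repos share no history for git to merge against.)

```sh
# In the production repo: export the selected mainline commits as patches.
# $START is a placeholder for the commit just before the first one I want.
cd /path/to/production-repo
git format-patch --output-directory /tmp/patches "$START"..master

# In the conversion repo: try each patch against the much-changed modules.
# --reject applies whatever hunks it can and leaves *.rej files for the
# rest -- which is exactly where the by-hand work would begin.
cd /path/to/conversion-repo
for p in /tmp/patches/*.patch; do
    git apply --reject "$p" || echo "Hand-merge the *.rej hunks from $p"
done
```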

Mike Robinson

1 Answer


I'm not sure if this is exactly what you are looking for, but my suggestion would be to use the Gitflow workflow.

The way this works, you have a main branch called master where the main code lives. This branch always holds the latest stable code for the project.

Alongside this branch is a develop branch, which holds all of the developmental progress. Additional features can be branched off of develop and merged back in later, when they are finished. (Hotfixes, by contrast, are typically branched off of master, so they can ship immediately.) For your situation, you could branch off of the develop branch for each of your migrations, bringing in one feature at a time.

When the develop branch is stable and it is time for a release, you can then merge develop into master. I have used this type of workflow for some of my larger projects, and it has worked out very well for me.
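In command form, one cycle of that workflow looks roughly like this (branch and tag names are only illustrative):

```sh
# Long-lived integration branch, created off master.
git checkout -b develop master

# One short-lived branch per migration batch.
git checkout -b feature/migrate-reports develop
#   ... bring one batch of files over, commit ...
git checkout develop
git merge --no-ff feature/migrate-reports
git branch -d feature/migrate-reports

# When develop is stable, cut a release into master.
git checkout master
git merge --no-ff develop
git tag -a conversion-1.0 -m "First stable cut of the converted code"
```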

Below is a diagram that shows how this type of workflow might look. The circles are commits color-coded by the legend. If you have any questions or further need for clarification from me, please don't hesitate to let me know.

[Diagram: Gitflow workflow, with commits on the master, develop, feature, and hotfix branches color-coded per the legend]

Dylan Wheeler
  • The problem here is that, while I'd like to do this on this project and I eventually will, *right now* I'm dealing with a much messier, albeit interim, situation. There are two repos ... UNRELATED repos. (The second didn't yet exist as a repo when the first was made, 'cuz it took time for the powers that be to realize how cool git is.) ;-) Also, there are literally thousands of changes that have been made to each file. – Mike Robinson Jun 24 '16 at 15:20
  • One thing I'm seriously thinking about, as an alternative to pretty-much everything that I described here, is to LOOK AT each of the 120-odd changes that have been committed in the now-production repo, and "do the same thing." If necessary, "by hand." If I focus only on the commits which were either made directly on the main branch, or that were merged into that branch, that actually looks like a fairly reasonable number of changes ... *patches* ... I would, in other words, abandon my technique of copying files and re-doing changes. – Mike Robinson Jun 24 '16 at 15:29
  • I welcome and request ANY and ALL thoughts about the above idea. Is it "hare-brained," or not? – Mike Robinson Jun 24 '16 at 15:29
  • (By the way, Confiqure, thanks for your comment! These are very good ideas which I will, indeed, employ. Just, I can't do them yet.) – Mike Robinson Jun 24 '16 at 15:30