5

Background:

We are currently using git for source management in a web app I am working on. There is an editor, and so there is also a web interface to git.

One of our use cases is that people can ALSO manage their git repositories from the command line, so the web interface needs to be able to handle, in some way, any weird state it finds the repository in.

Question:

For testing, it would be great to get a git repository with a file in every possible state, so I could verify that all possible conditions are handled. Reading "man git-status(1)", I counted 24 possible states (not counting ignored) that a file might be in.

I have only figured out how to create 17 of these states.

Here are the XY codes (see git-status) of the states I do not know how to reproduce:

D           M    deleted from index
C        [ MD]   copied in index
D           D    unmerged, both deleted
A           U    unmerged, added by us
U           A    unmerged, added by them

There is a gist on GitHub with a Ruby script which creates every state I already know how to reproduce. I would love to make that complete.

mtoy
  • 3
    Don't bother with an actual repository in your test code; just mock out the call to `git status` to produce the output you want your code to handle. – chepner Apr 01 '14 at 17:40
  • This is a good point. In my case, the question isn't "can I parse the output of git-status", but "do I offer the users the right options", and understanding what causes a state is part of understanding what choices they should have, plus there is a mini-OCD thing too :) – mtoy Apr 01 '14 at 19:17
  • ... and I have to add, actually reproducing the states in a git repository did introduce a state that mocking git-status would not have. In attempting to create "DM", I found a case where git-status returns TWO entries for a file (one "??" and one "D ") – mtoy Apr 01 '14 at 19:20
  • ... sorry to go on and on. @chepner makes a perfectly good suggestion; the other reason it is useful to generate every possible state is that I am switching back and forth between JGit and command line git, and one question is "what does JGit return for each possible state" – mtoy Apr 01 '14 at 20:36

3 Answers

3

By posting this question to the git mailing list, I got an answer for four of the seven codes. Together with the investigation done by @torek, there is now a complete answer to this question, though I don't quite know how to reflect that on Stack Overflow.

You can read the git mailing list discussion for details, but the short answer is that, given the information provided by @torek, none of the status combinations which I cannot reproduce are reachable by normal users of the git command line tools.

mtoy
  • 1
    Edit this into torek's question or supply it as a comment. Accept his answer. – pmr Apr 02 '14 at 00:07
2

For what it's worth, based on the source, there's no way to get "copied in index" to occur today:

wt-status.c:            status = d->index_status;
wt-status.c:            if (!d->index_status)
wt-status.c:                    d->index_status = p->status;
wt-status.c:                    d->index_status = DIFF_STATUS_UNMERGED;
wt-status.c:                    d->index_status = DIFF_STATUS_ADDED;
wt-status.c:            if (!d->index_status ||
wt-status.c:                d->index_status == DIFF_STATUS_UNMERGED)
wt-status.c:    if (d->index_status)
wt-status.c:            color_fprintf(s->fp, color(WT_STATUS_UPDATED, s), "%c", 
wt-status.h:    int index_status;

(where index_status is the letter printed for the first column). So the direct assignments can set it to U and A, and the copy-assignment from p->status can set it to whatever p->status is set to. That is ultimately controlled via this bit of code:

static void wt_status_collect_changes_index(struct wt_status *s)
{
        struct rev_info rev;
        struct setup_revision_opt opt;

        init_revisions(&rev, NULL);
        memset(&opt, 0, sizeof(opt));
        opt.def = s->is_initial ? EMPTY_TREE_SHA1_HEX : s->reference;
        setup_revisions(0, NULL, &rev, &opt);

        if (s->ignore_submodule_arg) {
                DIFF_OPT_SET(&rev.diffopt, OVERRIDE_SUBMODULE_CONFIG);
                handle_ignore_submodules_arg(&rev.diffopt, s->ignore_submodule_arg);
        }

        rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
        rev.diffopt.format_callback = wt_status_collect_updated_cb;
        rev.diffopt.format_callback_data = s;
        rev.diffopt.detect_rename = 1;
        rev.diffopt.rename_limit = 200;
        rev.diffopt.break_opt = 0;
        copy_pathspec(&rev.prune_data, &s->pathspec);
        run_diff_index(&rev, 1);
}

The diff options here are those shown above: detect_rename is set to DIFF_DETECT_RENAME (1 -- this should use the #define, really), with a limit of 200. If detect_rename had been set to DIFF_DETECT_COPY (2), you could get state C.

I tested this by modifying wt-status.c (see below), then fussing about with another file:

$ git status --short
 M wt-status.c
$ git mv zlib.c zzy.c; cp zzy.c zzz.c; git add zzz.c; git status --short
 M wt-status.c
R  zlib.c -> zzy.c
A  zzz.c
$ ./git-status --short
 M wt-status.c
C  zlib.c -> zzy.c
R  zlib.c -> zzz.c

Note that the equivalent of --find-copies-harder is still not set, so you need at least one rename already in place to get the copied status. Undoing the rename makes it disappear:

$ git mv zzy.c zlib.c; ./git-status --short
 M wt-status.c
A  zzz.c

To get the copied status without a rename in place, I also had to add another DIFF_OPT_SET:

$ git diff
diff --git a/wt-status.c b/wt-status.c
index 4e55810..06310e3 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -494,7 +494,8 @@ static void wt_status_collect_changes_index(struct wt_status
        rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
        rev.diffopt.format_callback = wt_status_collect_updated_cb;
        rev.diffopt.format_callback_data = s;
-       rev.diffopt.detect_rename = 1;
+       rev.diffopt.detect_rename = DIFF_DETECT_COPY;
+       DIFF_OPT_SET(&rev.diffopt, FIND_COPIES_HARDER);
        rev.diffopt.rename_limit = 200;
        rev.diffopt.break_opt = 0;
        copy_pathspec(&rev.prune_data, &s->pathspec);
$ ./git-status --short
 M wt-status.c
C  zlib.c -> zzz.c

I'll stop at this point though, that's enough playing for now :-)
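The gap between what git status reports and what the diff machinery can detect is also visible without patching git. The following is only a rough sketch, not part of the experiment above: it assumes a scratch repository in a temporary directory, hypothetical file names (source.txt, copy.txt), and a placeholder identity. It stages an exact copy of a committed file, then compares git status --short with git diff --cached run with copy detection switched on.

```shell
#!/bin/sh
# Sketch: a staged copy shows up as "A" in git status, but as "C" when
# the index is diffed against HEAD with copy detection enabled.
# The file names and the identity below are placeholders.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "test@example.com"
git config user.name "Test"

# Commit a non-empty file that will serve as the copy source.
printf 'alpha\nbeta\ngamma\n' > source.txt
git add source.txt
git commit -q -m "add source.txt"

# Stage an exact copy without touching the original.
cp source.txt copy.txt
git add copy.txt

# Status detects renames only, so the copy is reported as an addition.
status_output=$(git status --short)

# The same index-vs-HEAD comparison with copy detection that also
# considers unmodified files as possible copy sources.
diff_output=$(git diff --cached --name-status -C --find-copies-harder)

printf 'status: %s\n' "$status_output"
printf 'diff:   %s\n' "$diff_output"
```

The second command is the command line counterpart of the two changes in the patch: -C corresponds to DIFF_DETECT_COPY and --find-copies-harder to the FIND_COPIES_HARDER flag.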

Chris

torek
  • 1
    This is awesome. There is a [git mailing list discussion](http://git.661346.n2.nabble.com/git-status-trying-to-understand-all-possible-states-tp7607202.html) which provides an explanation of the other four codes, so I am calling this one answered. You can follow the link for details, but the short answer is, with the information provided by torek, none of the status combinations which I cannot reproduce are reachable by normal users of the git command line tools. – mtoy Apr 03 '14 at 00:14
0

I was able to reproduce the following three states by causing a rename conflict, i.e. by merging two branches that each renamed the same file, but to different names:

D           D    unmerged, both deleted
A           U    unmerged, added by us
U           A    unmerged, added by them
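
A minimal sketch of that rename conflict, for anyone who wants to script it. The branch names (ours, theirs), the file names, and the identity are placeholders chosen for illustration, and the script works in a throwaway temporary directory. It assumes rename detection during merge is enabled, which is git's default.

```shell
#!/bin/sh
# Sketch: produce DD, AU and UA by renaming the same file differently
# on two branches and then merging them.
# Branch names, file names and the identity are placeholders.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "test@example.com"
git config user.name "Test"

# Common ancestor: one file that both branches will rename.
printf 'line 1\nline 2\nline 3\n' > original.txt
git add original.txt
git commit -q -m "add original.txt"
base=$(git rev-parse HEAD)

# "theirs" renames the file one way.
git checkout -q -b theirs "$base"
git mv original.txt renamed-by-them.txt
git commit -q -m "rename original.txt (theirs)"

# "ours" renames the same file another way.
git checkout -q -b ours "$base"
git mv original.txt renamed-by-us.txt
git commit -q -m "rename original.txt (ours)"

# The merge stops with a rename/rename conflict, so it exits non-zero.
git merge theirs >/dev/null 2>&1 || true

status_output=$(git status --short)
printf '%s\n' "$status_output"
```

After the failed merge, the original path is reported as DD, the name chosen on the current branch as AU, and the name chosen on the merged branch as UA.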
11th Hour Worker