How can we use go-git to generate a list of all the files that changed between two commits, similar to git diff --name-only commit1 commit2?

For context, we have a Git monorepo that contains a single root go.mod file but multiple Go applications. When developers push commits to a branch, we would like to get a list of all files that changed between two commits (git diff --name-only) and filter it down to a list of application directories, excluding some directories. Our ultimate goal is to build, deploy, and test only the applications that have changed inside the monorepo. We have a bash script similar to this one from Shippable that does this, but we'd like to use pure Go and go-git.

SteveCoffman
  • You might need to build it yourself using [Tree.Diff](https://godoc.org/gopkg.in/src-d/go-git.v4/plumbing/object#Tree.Diff). – mkrieger1 Nov 18 '19 at 11:23
  • I've updated your title to match the new question, too. – Jonathan Hall Nov 18 '19 at 11:31
  • TBH, having one repo for multiple applications sounds odd to me in the first place. Have multiple repos each with their own CI, and if you absolutely must, have a "parent" repo which contains the other repos as [submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules). Additionally, given that the Go compiler is explicitly optimised for compilation speed (and I rarely have seen any Go program that takes longer than a few seconds to compile), I fail to see the advantage. Like in "at all". – Markus W Mahlberg Nov 19 '19 at 08:30
  • TBH, I agree with you @MarkusWMahlberg but I'm tasked with making the monorepo work, not with deciding whether to do it. – SteveCoffman Nov 19 '19 at 14:29

1 Answer

It appears that the files returned by change.Files() carry only the base file name in their Name field (for example to.Name), without the path inside the repository, whereas change.String() includes the full path.

So if you want to use Tree.Diff, you have to get the paths like this:

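// getChangeName returns the repository-relative path of a change,
// preferring the "from" side and falling back to "to" for added files.
func getChangeName(change *object.Change) string {
    var empty object.ChangeEntry
    if change.From != empty {
        return change.From.Name
    }
    return change.To.Name
}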

So with that, it looks like you can choose either Tree.Diff or Patch.Stats depending on your needs:

    // Needs fmt plus go-git's plumbing/object and utils/merkletrie packages.
    currentTree, err := commit.Tree()
    CheckIfError(err)

    prevTree, err := prevCommit.Tree()
    CheckIfError(err)

    // Diff from the older tree to the newer one, so the direction matches
    // git diff prevCommit commit. Calling it the other way round would
    // report newly added files as deletions.
    patch, err := prevTree.Patch(currentTree)
    CheckIfError(err)
    fmt.Println("----- Patch Stats ------")

    var changedFiles []string
    for _, fileStat := range patch.Stats() {
        fmt.Println(fileStat.Name)
        changedFiles = append(changedFiles, fileStat.Name)
    }

    changes, err := prevTree.Diff(currentTree)
    CheckIfError(err)
    fmt.Println("----- Changes -----")
    for _, change := range changes {
        // Ignore deleted files
        action, err := change.Action()
        CheckIfError(err)
        if action == merkletrie.Delete {
            continue
        }
        // Print the path of each added or modified file
        name := getChangeName(change)
        fmt.Println(name)
    }

Patch.Stats skips binary files, whereas Tree.Diff lets you ignore deletions.
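
To get from a list of changed files to the application directories mentioned in the question, something like the following might work. It is only a sketch using the standard library: the helper name changedAppDirs, the sample paths, and the exclude list are all hypothetical, and it assumes each application lives in its own top-level directory and that paths are slash-separated and relative to the repository root.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// changedAppDirs maps changed file paths to the sorted, de-duplicated set of
// top-level directories they belong to, skipping any directory listed in
// excluded.
// Assumption: one application per top-level directory. Files that sit directly
// in the repository root (such as go.mod) are ignored here.
func changedAppDirs(changedFiles, excluded []string) []string {
	skip := make(map[string]bool, len(excluded))
	for _, dir := range excluded {
		skip[dir] = true
	}

	seen := make(map[string]bool)
	var dirs []string
	for _, file := range changedFiles {
		idx := strings.Index(file, "/")
		if idx < 0 {
			continue // file is in the repository root
		}
		dir := file[:idx]
		if skip[dir] || seen[dir] {
			continue
		}
		seen[dir] = true
		dirs = append(dirs, dir)
	}
	sort.Strings(dirs)
	return dirs
}

func main() {
	// Hypothetical input, standing in for the names collected from
	// Patch.Stats or Tree.Diff.
	changed := []string{
		"go.mod",
		"billing/main.go",
		"billing/internal/invoice.go",
		"docs/README.md",
		"auth/handler.go",
	}
	fmt.Println(changedAppDirs(changed, []string{"docs"})) // prints [auth billing]
}
```

Because there is a single root go.mod, a change to it (or to shared packages) probably affects every application, so you may want to treat root-level changes as "rebuild everything" rather than skipping them as this sketch does.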

SteveCoffman