How to keep a local git mirror up-to-date without polling?

Question

The scenario: Our development team uses Jenkins for continuous integration, and some of our code is open source and therefore hosted at GitHub.com.

We have a local mirror of the relevant GitHub repositories, and our local GitBlit server is set to periodically poll the GitHub repositories to update the local mirror.

This "sort of works", but the problem scenario is this:

1. A developer realizes he needs to make a change to the open-source codebase, so he pushes the change to the GitHub repository, and also updates the submodules in our closed-source Git repositories to point to the new revision.
2. The developer then triggers an autobuild on Jenkins so he can verify that the changes work on all platforms.
3. The Jenkins autobuild fails spectacularly because the local mirror of the GitHub repository hasn't yet been updated to match the original, so when Jenkins tries to update the submodules in its various workspaces, the local mirror doesn't recognize the revision ID that the closed-source Git repositories point at.

Our current workaround is to set GitBlit to poll GitHub more often, but I don't like that as a solution: it causes more periodic, unnecessary traffic across the Internet, and it still doesn't entirely avoid build failures, e.g. when a developer pushes changes and then triggers a build immediately afterwards.

Is there a known "best-practice" solution for this problem that would automatically give us reliable Jenkins-build-behavior and also avoid constantly polling GitHub?

score 2 · Accepted Answer · answered Nov 19 '18 at 20:31


You can use a GitHub webhook to notify your local infrastructure about events such as the following:

A repository is pushed to

A pull request is opened

A GitHub Pages site is built

A new member is added to a team
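For the push event in the list above, a receiver on the local side could look roughly like the following. This is a minimal sketch, not a hardened service: `WEBHOOK_SECRET`, `MIRROR_PATH`, the port, and the function names are all hypothetical, and it assumes the webhook is configured with a secret so that GitHub sends an `X-Hub-Signature-256` header.

```python
import hashlib
import hmac
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values: replace with your own secret and mirror location.
WEBHOOK_SECRET = b"change-me"
MIRROR_PATH = "/srv/git/mirror.git"


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Compare the X-Hub-Signature-256 header with an HMAC-SHA256 of the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Encode both sides so compare_digest accepts any header content.
    return hmac.compare_digest(expected.encode(), (header or "").encode())


def update_mirror(path: str) -> None:
    """Fetch from the mirror's configured remotes and prune deleted refs."""
    subprocess.run(["git", "-C", path, "remote", "update", "--prune"], check=True)


class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        signature = self.headers.get("X-Hub-Signature-256", "")
        if not signature_is_valid(WEBHOOK_SECRET, body, signature):
            self.send_response(403)
            self.end_headers()
            return
        # Only push events change refs; other events are acknowledged and ignored.
        if self.headers.get("X-GitHub-Event") == "push":
            update_mirror(MIRROR_PATH)
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the listener (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), PushHandler).serve_forever()
```

Calling `serve()` starts the listener. Running the fetch inside the request handler keeps the sketch short, but a slow fetch may outlast the webhook delivery timeout, so a real deployment would probably hand the work to a queue or background thread.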

Note that a webhook only minimizes the delay. In some cases, e.g. network problems or a partial outage of GitHub's infrastructure, your build can still fail.

Having the Jenkins autobuild update the local mirror before it builds is probably the only safe solution.
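A pre-build step along these lines might look like the sketch below. It is a variation on the idea rather than a confirmed recipe: it fetches only when the revision the build needs is missing from the mirror, which also keeps traffic to GitHub low. The function names, the mirror path, and the way the build obtains the required `sha` are assumptions.

```python
import subprocess
from typing import List


def mirror_update_command(mirror_path: str) -> List[str]:
    """Build the git command that refreshes every remote of the mirror."""
    return ["git", "-C", mirror_path, "remote", "update", "--prune"]


def refresh_mirror(mirror_path: str) -> None:
    """Run the fetch; raises CalledProcessError if git fails."""
    subprocess.run(mirror_update_command(mirror_path), check=True)


def mirror_has_commit(mirror_path: str, sha: str) -> bool:
    """Return True if the mirror already contains the given commit."""
    result = subprocess.run(
        ["git", "-C", mirror_path, "cat-file", "-e", sha + "^{commit}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def ensure_commit(mirror_path, sha, has_commit=mirror_has_commit,
                  refresh=refresh_mirror) -> bool:
    """Fetch only if the commit is missing. Returns True if a fetch happened."""
    if has_commit(mirror_path, sha):
        return False
    refresh(mirror_path)
    if not has_commit(mirror_path, sha):
        raise LookupError(f"{sha} not found in {mirror_path} even after fetching")
    return True
```

If the required revision is not known up front, calling `refresh_mirror(...)` unconditionally as the first build step matches the suggestion above more literally, at the cost of one fetch per build.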


Karol Dowbecki


But he can't do that if they don't own the upstream GitHub repository. – mkasberg Nov 19 '18 at 20:33
In this case we do own the upstream GitHub repository (although it's also good to know about solutions that could be applied in cases where we don't own it, just for future reference) – Jeremy Friesner Nov 19 '18 at 20:56

score 0 · Answer 2 · answered Nov 19 '18 at 20:42


I think the best solution here is to use a real Git repository mirror rather than trying to roll your own. Without access to webhooks (assuming you don't own the GitHub repository), the best you can do is polling.

There are open source solutions available (Artifactory and Nexus come to mind) that can mirror a Git repository and provide caching functionality. I think you'll find that these mirrors are much more reliable than a script that updates on a certain interval. Moreover, I think they can do things like run a quick hash verification against the upstream repo when the user tries to pull, so they know if they are out of date (and will immediately update to provide the correct version).
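Whether those products actually perform such a check is not confirmed here, but the "quick hash verification" idea can be sketched with plain Git: compare the ref hashes that the upstream advertises with the ones stored in the mirror, and fetch only when they differ. The function names and the default remote name `origin` are assumptions.

```python
import subprocess
from typing import Dict, List


def parse_refs(output: str) -> Dict[str, str]:
    """Parse '<sha> <refname>' lines (tab- or space-separated) into a dict."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, name = line.split(None, 1)
        refs[name.strip()] = sha
    return refs


def stale_refs(remote: Dict[str, str], local: Dict[str, str]) -> List[str]:
    """Names of upstream refs that are missing or different in the mirror."""
    return sorted(name for name, sha in remote.items() if local.get(name) != sha)


def upstream_refs(mirror_path: str, remote_name: str = "origin") -> Dict[str, str]:
    """Ask the upstream for its branch and tag hashes without fetching objects."""
    out = subprocess.run(
        ["git", "-C", mirror_path, "ls-remote", "--refs", "--heads", "--tags",
         remote_name],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_refs(out)


def mirror_refs(mirror_path: str) -> Dict[str, str]:
    """Read the branch and tag hashes currently stored in the mirror."""
    out = subprocess.run(
        ["git", "-C", mirror_path, "for-each-ref",
         "--format=%(objectname) %(refname)", "refs/heads", "refs/tags"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_refs(out)
```

A non-empty result from `stale_refs(upstream_refs(path), mirror_refs(path))` means the mirror is out of date. Note that this is still a round trip to the upstream on each check, just a much cheaper one than a full fetch.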


mkasberg


"git clone --mirror" doesn't count as a real mirror? – Jeremy Friesner Nov 19 '18 at 20:57

As far as I can tell, `git clone --mirror` does nothing to keep the mirror up-to-date; it just configures the repository to behave in the way a mirror would be expected to. There's certainly nothing wrong with scripting that, but I also think there's no obvious solution to the problems you describe if you go that route. I'd expect that some of the technologies I was referring to use `git clone --mirror` internally, but provide the additional functionality I talked about. – mkasberg Nov 19 '18 at 21:18
