Hook Type to Exclude File Types from Git Repository

Question

I am trying to write a git hook that will prevent files such as .exe files from being pushed to repositories.

We use GitLab for our source control and this hook will function in place of the GitLab 'Prohibited file names' push rule.

My question is: which type of hook should I be looking to write?

I have gone down the road of writing a pre-receive hook. Does this sound right?

Thanks,

Sean

Why are you looking for a hook? Why aren't you using git ignore? — kabanus, Dec 11 '18 at 13:38
Do you mean that you want to keep the history of .exe file commits in the repository, or that you want to remove them from the git index entirely? If you want to remove them, you can just use a .gitignore file. — bananaspy, Dec 11 '18 at 13:39
@kabanus you don't get to control what other people have in their repositories, only what's allowed in yours, so you need pre-receives to vet what people push to your repos. — jthill, Dec 11 '18 at 16:55
@jthill I agree, but it sounds like a standard "don't want binaries tracked" issue. It sounds like the team agrees on what's prohibited here, and it seems gitignore would work, but of course I cannot be sure until OP answers. Pre-receive also seems weird; from the phrasing it seems like they're trying to prevent the push. Anyway, let's wait for clarification. — kabanus, Dec 11 '18 at 19:10
@jthill just read the new answer. Server-side enforcing makes sense, though if it's a collaboration I would still prefer agreement on what should be tracked. Interesting option though. — kabanus, Dec 11 '18 at 19:14

ack_inc · Answer 1 · 2018-12-11T14:08:07.070

2

The best way to keep files from being checked into source control is a .gitignore file.

However, if you really want to write a hook, the most appropriate one to use in this situation (i.e. to do something before pushing to a remote) is the 'pre-push' hook.

edited Dec 11 '18 at 14:08

answered Dec 11 '18 at 13:36

ack_inc


The question is: "... which type of hook should I be looking to write?" Could you explain why mine is not an answer? – ack_inc Dec 11 '18 at 14:00
Good point, I will delete my comment. I would rephrase this not as a suggestion but as an answer so others won't be confused as well. Consider also mentioning `.gitignore`, as it seems OP may be unfamiliar. – kabanus Dec 11 '18 at 14:04

score 2 · Answer 2 · answered Dec 11 '18 at 16:35

On the server, you have two options, namely the pre-receive and update hooks. (For this particular case I'd probably use an update hook myself.)

The pre-receive hook is invoked once, with standard input connected to a pipe containing all the proposed reference updates. You should read all stdin lines, and use the old and new hash IDs and the names of all the references to decide whether the entire push is allowed to proceed, or the entire push—all name-updates—is to be rejected all at once. That is, given that some client has run:

git push origin hash1:name1 hash2:name2 ... hashN:nameN

so that there are N update requests on N lines of stdin, your pre-receive hook either accepts all, or rejects all. To accept all, exit with status zero; to forbid all, exit with any nonzero status. It's a good idea to print the reason for the rejection, if you exit nonzero, so that the client will see why you did this.
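The all-or-nothing behaviour described above could be sketched roughly as follows. This is only a skeleton: `update_allowed` and `check_push` are hypothetical names, and the policy function is a stub that accepts everything until you fill it in.

```shell
#!/bin/sh
# Sketch of a pre-receive hook skeleton (all names here are illustrative).

# Hypothetical policy check for one proposed update; replace the body.
# Arguments: old hash, new hash, full reference name.
update_allowed() {
    return 0
}

# Reads every "<old> <new> <ref>" line from stdin. Returns nonzero if any
# single update is refused, which rejects the whole push.
check_push() {
    status=0
    while read -r old new ref; do
        if ! update_allowed "$old" "$new" "$ref"; then
            # Anything printed here is relayed to the person pushing.
            echo "rejected: $ref ($old -> $new)" >&2
            status=1
        fi
    done
    return "$status"
}

# In the installed hook file, the last line would be:  check_push
```

Because the loop keeps reading after the first refusal, the client sees every rejected reference in one go rather than just the first.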

The update hook is invoked once per proposed update, after the pre-receive hook (if there is one) has allowed the entire process to enter the second phase. It has three positional parameters giving the same information that came in on one of the input lines to the pre-receive hook. You should examine the two hash IDs and the name, and decide whether this particular update is allowed.

That is, given the same client invocation, your update hook will be invoked N separate times. The second one will, for instance, have:

$1: refs/heads/name2
$2: the old hash ID (or the all-zeros "null hash")
$3: the new hash ID (or all-zeros)

If you're willing to have name2 set to point to the new hash ID, have your update hook exit with a zero (success) status. If not, have it exit with a nonzero status. Again, it's a good idea to print something if you are going to reject the update.
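As a rough illustration, an update hook for the .exe case might look something like the sketch below. It takes the simplest possible approach, which is an assumption on my part rather than the only way to do it: it rejects the update if the proposed tip contains any .exe file at all, so a repository that already holds such a file would refuse every push. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of an update hook: $1 = ref name, $2 = old hash, $3 = new hash.

# Reads path names on stdin and prints those ending in .exe (any case).
forbidden_paths() {
    grep -i '\.exe$'
}

# Returns nonzero, with a message on stderr, if the proposed tip of the
# reference contains a forbidden file.
check_update() {
    ref=$1
    new=$3          # $2 (the old hash) is not needed by this simple check
    case "$new" in
        *[!0]*) ;;          # a real hash: fall through to the check
        *) return 0 ;;      # all zeros: the ref is being deleted
    esac
    bad=$(git ls-tree -r --name-only "$new" | forbidden_paths)
    if [ -n "$bad" ]; then
        echo "rejecting $ref: forbidden files in $new:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}

# In the installed hook file, the last line would be:  check_update "$@"
```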

About server-side hooks in general

Each hook receives, per reference, an old hash ID ($old below), a new one ($new), and the full name of the reference: refs/heads/name if the reference is a branch name, refs/tags/name if it's a tag name, refs/notes/name if it's a notes reference, and so on. An update hook has finer granularity, but cannot see the proposed update as a whole.

At most one of $old or $new will be all-zeros. If so, the reference is to be created—e.g., a new branch or tag—or destroyed. Otherwise, it's an in-place update: the reference currently points to hash ID $old and the person running git push is proposing to change it to point to hash ID $new.
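The three cases can be told apart with a small helper along these lines. The function names are made up for illustration; the null-hash test accepts any run of zeros so that it does not depend on the hash length.

```shell
#!/bin/sh
# Succeeds when the argument is a non-empty string of zeros (the null hash).
is_null_hash() {
    case "$1" in
        ''|*[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints create, delete, or update for an old/new pair of hash IDs.
# Arguments: old hash, new hash.
classify_update() {
    if is_null_hash "$1"; then
        echo create
    elif is_null_hash "$2"; then
        echo delete
    else
        echo update
    fi
}
```

For example, `classify_update "$old" "$new"` prints `create` when `$old` is all zeros.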

These hooks are powerful but tricky to write. The easy case is an in-place update of an existing reference: the update will add the commits in the $old..$new range, so:

git rev-list "$old..$new" | while read -r rev; do
    # examine the files in $rev, e.g. list the paths it changes
    git diff-tree --no-commit-id --name-only -r "$rev"
done

suffices to allow you to inspect the contents of each proposed new commit. (A non-fast-forward push may also discard commits; those can be found by inspecting $new..$old.)
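To make "examine the files" concrete, here is one possible shape for the loop, checking each commit in the range for .exe files it adds or modifies. Treat it as a sketch under assumptions: `forbidden_re`, `forbidden_paths` and `list_forbidden` are hypothetical names, and merge commits produce no output from `git diff-tree` by default, so a stricter hook would need to handle those separately.

```shell
#!/bin/sh
# Hypothetical pattern; extend the alternation to cover other extensions.
forbidden_re='\.(exe)$'

# Reads path names on stdin and prints those matching the pattern.
forbidden_paths() {
    grep -iE "$forbidden_re"
}

# Prints "<commit> <path>" for each forbidden file that a commit in
# old..new adds or modifies. Arguments: old hash, new hash.
list_forbidden() {
    git rev-list "$1..$2" | while read -r rev; do
        git diff-tree --root --no-commit-id --name-only -r --diff-filter=AM "$rev" |
            forbidden_paths | sed "s|^|$rev |"
    done
}
```

Checking commit by commit, rather than comparing only the two endpoints, also catches a file that was added in one commit and deleted again in a later one, since it would still be stored in the history.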

However, if the reference is newly created, $old will be all-zeros. It's impossible to tell for certain which commits are newly introduced solely by this particular reference. You can use this trick:

git rev-list $new --not --all

to enumerate commits reachable from the proposed new reference, but not from any current reference. That could be misleading, though: perhaps the push is a request to create three new branch names:

...--o--o--o   <-- master
            \
             A--B   <-- newbranch1
                 \
                  C   <-- newbranch2
                   \
                    D--E--F   <-- newbranch3

Taken in isolation, the request to set newbranch3 to point to a commit whose hash ID is F looks like a request to add all six commits (which it is!) but you might prefer to view it as a request to add just three commits after the otherwise-added branch newbranch2, for instance. It's not possible to produce this view in an update hook. It is possible (but hard) to produce it in the pre-receive hook as it can tell that all three newbranch* names are new.
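One way a pre-receive hook might sidestep the per-reference question is to treat the push as a whole: gather every proposed new hash and ask for all commits reachable from any of them but from no existing reference. The sketch below assumes it runs in pre-receive, before any reference has been updated; `new_tips` and `pushed_commits` are hypothetical names.

```shell
#!/bin/sh
# Reads pre-receive lines ("<old> <new> <ref>") on stdin and prints each
# new hash that is not the all-zeros null hash, one per line.
new_tips() {
    while read -r old new ref; do
        case "$new" in
            *[!0]*) echo "$new" ;;   # has a non-zero character: a real hash
        esac
    done
}

# Prints every commit the push would add, each listed once.
pushed_commits() {
    tips=$(new_tips)
    [ -n "$tips" ] || return 0       # only deletions: nothing to inspect
    # Word splitting of $tips is intended: one argument per hash.
    git rev-list $tips --not --all
}
```

For the three-branch push drawn above, this should list A through F once each, instead of listing A and B again for every branch that contains them.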

Thanks for this answer, very illuminating. From your experience, is server side enforcing preferable to a team agreed workflow/ ignore file? — kabanus, Dec 11 '18 at 19:15
Where I've worked, we've sometimes done some of this server-side and sometimes client-side. It's mostly a matter of taste and team abilities. Server side has the advantage that it's 100% enforced, and has the disadvantage that it's 100% enforced. :-) — torek, Dec 11 '18 at 19:20
