2

I've just started using GitHub in depth, so I'm still learning. I want to use a Python GitHub API library (currently PyGithub) to create a new branch, commit, and open a pull request automatically. A couple of things are confusing me. Any help would be greatly appreciated.

  1. When I create a new branch using the API, it requires a SHA/hash value. Is there any doc or guideline on how I should compute this hash value? Can it be just any hash? I noticed that when I create a new branch on github.com, it doesn't require the user to specify a hash value. I'm guessing the website is doing it for you, so is the generation based on something?

  2. I'm still studying, but what I gathered from the official Git page is that a branch is just an alias for a hash value. While trying to figure out question #1, I tried creating two branches with the same hash. It works, and AFAIK all commits are going to the right branch, so it's doing the right thing. But since the two branches have the same hash value, should the commits go to both branches?

Thanks, K

larsks
Trouble

2 Answers

2

If you are using https://github.com/PyGithub/PyGithub, you can create a branch (as in this test) with Repository.create_git_ref:

ref = self.repo.create_git_ref("refs/heads/BranchCreatedByPyGithub", "4303c5b90e2216d927155e9609436ccb8984c495")

A branch is generally created from another branch, which means you should call Repository.get_git_ref first, using the name of the branch from which you want to start: that will give you the SHA1 to use with create_git_ref.
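To make those two steps concrete, here is a minimal sketch. It assumes `repo` is a PyGithub `Repository` object (for example, one returned by `Github(token).get_repo("owner/name")`). The helper name `create_branch_from` and the branch names are hypothetical; only `get_git_ref` and `create_git_ref` are PyGithub calls.

```python
def create_branch_from(repo, source_branch, new_branch):
    """Create new_branch pointing at the current tip commit of source_branch.

    `repo` is assumed to be a PyGithub Repository (duck-typed here, so this
    snippet itself does not import the library).
    """
    # get_git_ref takes the ref name without the leading "refs/".
    source_ref = repo.get_git_ref(f"heads/{source_branch}")
    tip_sha = source_ref.object.sha  # SHA-1 of the commit the branch points at

    # create_git_ref takes the fully qualified name, including "refs/".
    return repo.create_git_ref(ref=f"refs/heads/{new_branch}", sha=tip_sha)
```

A call such as `create_branch_from(repo, "master", "my-feature")` would then create the branch. The point is that the SHA is never computed by you: it is looked up from a commit that already exists, and it cannot be an arbitrary hash.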

VonC
  • Thanks for the reply. Yes, that works, and that led to my #2. When I create a new branch from an existing one, they share the same hash. Wouldn't there be a collision of some sort? – Trouble Jan 11 '19 at 15:11
  • @Trouble no collision: the second branch is ready to accept new commit. See figure 3-4 (and the text below) in https://git-scm.com/book/en/v1/Git-Branching-What-a-Branch-Is – VonC Jan 11 '19 at 15:44
1

The key item you're getting to here—the source of your question—is that branch names don't mean much at all, in Git. They are just moveable pointers that, by definition, point to the last commit in a branch. Multiple names can point to any single commit.

In Git, it's the commits that matter. The commit is Git's raison d'être. A commit acquires its hash ID simply by being created because, as with all four of Git's object types, the hash ID is the cryptographic checksum of the object's content. Since each commit is unique (it has a timestamp to help out, in case everything else about the commit is the same as some earlier one), each commit acquires a new, unique hash ID.
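As a rough illustration of where a hash ID comes from, the sketch below hashes an object the way Git does in a SHA-1 repository: a header of the form `<type> <size>\0` followed by the content. The function name `git_object_id`, the author identity, and the timestamps are made up for the example.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 ID Git would assign to an object with this type and content."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# ID of the empty tree, used as the snapshot for both hypothetical commits.
EMPTY_TREE = git_object_id("tree", b"")

def commit_body(timestamp: int) -> bytes:
    """Build the text of a hypothetical root commit; only the timestamp varies."""
    ident = f"K <k@example.com> {timestamp} +0000"
    text = f"tree {EMPTY_TREE}\nauthor {ident}\ncommitter {ident}\n\nsame message\n"
    return text.encode()

first = git_object_id("commit", commit_body(1547219460))
second = git_object_id("commit", commit_body(1547219461))
print(first != second)  # True: one second of difference gives a different commit ID
```

This is why the API asks for a SHA rather than letting you invent one: the value has to be the ID of an object that already exists in the repository.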

However, commit hash IDs are seemingly random and impossible for humans to remember or work with. So we need some way to name the latest commit that we want to remember. That way is, generally, with a branch name. Once we have a commit, we can point any number of branch names at it.

Each commit remembers the hash ID of its parent (or parents, for a merge), so we only need to remember the last, or tip, commit of the branch: all the earlier ones are findable by starting at the end and working backwards. So the branch name identifies the tip commit only.

When Git creates a new commit, Git simply writes the new commit's hash ID into the current branch. Which branch is the current branch? The answer to that is equally simple: the special name HEAD holds the name of the current branch.
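A toy model may help show why two branches that start at the same hash do not both receive new commits. This is not how Git is implemented; `ToyRepo` and its methods are invented for illustration, with commit IDs replaced by short labels.

```python
import itertools

class ToyRepo:
    """Branches are names mapped to a commit ID; HEAD names the current branch."""

    def __init__(self):
        self.parents = {}        # commit ID -> parent commit ID (None for the first)
        self.branches = {}       # branch name -> tip commit ID
        self.head = "master"     # HEAD holds the *name* of the current branch
        self._ids = itertools.count(1)

    def commit(self):
        new_id = f"c{next(self._ids)}"
        self.parents[new_id] = self.branches.get(self.head)
        self.branches[self.head] = new_id  # only the current branch moves
        return new_id

    def branch(self, name, start):
        self.branches[name] = self.branches[start]  # copy the pointer, nothing else

    def checkout(self, name):
        self.head = name

repo = ToyRepo()
repo.commit()                     # c1 on master
repo.branch("feature", "master")  # both names now point at c1
repo.checkout("feature")
repo.commit()                     # c2: only "feature" advances
print(repo.branches)              # {'master': 'c1', 'feature': 'c2'}
```

Both names shared `c1` for a moment, but the new commit was written only into the branch that `HEAD` named, so `master` stayed where it was.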

It's important to make sure that any useful Git commit is reachable by some name, because Git will eventually garbage collect any unreachable commits. That is, if the name xyz identifies commit a123456..., that commit is protected from the garbage collector. So is that commit's parent (or parents), and grandparents, and so on. Git gives you some time (14 days, by default) to hook things up so that objects, including commits, are protected through this reachability idea: you first create an object, such as a blob or tree or commit, then update any name(s) needed to be able to find the object and any of its ancestry. The 14-day window is your grace period to complete the name-updating after creating the object(s).
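The reachability idea can be sketched as a walk backwards from every name. This is a simplification under stated assumptions: the commit labels and the `reachable` helper are hypothetical, and real Git also takes other references, such as tags and reflogs, into account.

```python
def reachable(parents, refs):
    """Return every commit that can be found by walking back from some name."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])  # follow parent links towards the root
    return seen

# History: a <- b <- c, plus d, which also has b as its parent.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}

refs = {"master": "c"}
unprotected = set(parents) - reachable(parents, refs)
print(unprotected)  # {'d'}: no name leads to d, so it is a candidate for collection

refs["xyz"] = "d"   # pointing a name at d protects it and its ancestors
print(set(parents) - reachable(parents, refs))  # set()
```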

torek