I am a Git user, and my question concerns git clone over the SSH protocol.

When I run Git operations over SSH, I have the following questions.

  1. Take a clone command such as git clone ssh://git@<serverurl>:7999/text/large_files.git. During this clone operation, is the SSH request stateful or stateless?

  2. I have enabled Git LFS to fetch large files, but LFS file transfer goes over https. If my clone involves LFS files, the git clone command triggers https requests. In that case, is the main git clone ssh://... session still active, and does it still maintain session state?

rgh
  • One state that comes to mind is that an SSH connection depends on the underlying TCP/IP connection, which has state, so by that logic SSH trivially is stateful. – Joachim Sauer Nov 22 '19 at 08:26

1 Answer

As Joachim Sauer points out in a comment, SSH itself is obviously (trivially) stateful. But this doesn't matter here. Your question assumes that Git-LFS performs its special tricks while the clone is transferring data over SSH. It does not do this.

The way Git-LFS works is that it replaces the stored Git object data (the blob data) for "large" files with data that Git-LFS uses to access the original file data from a third-party location (via, as you noted, https). This means that Git itself never sees the third-party location. Git has no idea that this substitution is going on.

The actual substitution takes place at two points:

  1. When a file is copied from Git's index to the work-tree: Git-LFS uses a smudge filter to replace the access data (which is all Git has stored) with the real data. Git never sees the real data: it's only ever in the work-tree, and in the third-party location. Git sees only the substitute access information that the Git-LFS clean filter produces (see step 2 below).

  2. When a file is copied from the work-tree to the index: Git-LFS uses a clean filter to set the actual data aside for the third-party location (the upload itself normally happens later, at git push time) and replace it with substitute access information. That's why Git never sees the real data: it never goes into the index, and Git makes new commits from the index.

Step 1 occurs whenever Git copies a file to the work-tree through the index, via git checkout. Step 2 occurs whenever you tell Git to copy the work-tree file back into the index, via git add. There are a few more corner cases that invoke steps 1 and/or 2, but these are the main two.
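To make the two steps concrete, here is a toy model of the substitution in Python. It is only a sketch, not how Git-LFS is implemented: the names clean, smudge, and lfs_store are made up for illustration, and a plain dictionary stands in for everything Git-LFS keeps outside Git's object database (its local cache and the third-party server). The three-line pointer layout (version, oid, size) follows the Git-LFS pointer file format.

```python
import hashlib

POINTER_VERSION = "version https://git-lfs.github.com/spec/v1"

# Hypothetical stand-in for Git-LFS storage outside Git's object database.
# In reality this is a local cache plus a server reached over https.
lfs_store = {}


def clean(content):
    """Work-tree -> index: keep the real bytes out of Git, return a small pointer."""
    oid = hashlib.sha256(content).hexdigest()
    lfs_store[oid] = content
    pointer = f"{POINTER_VERSION}\noid sha256:{oid}\nsize {len(content)}\n"
    return pointer.encode("utf-8")


def smudge(pointer):
    """Index -> work-tree: parse the pointer and return the real bytes."""
    lines = pointer.decode("utf-8").splitlines()
    fields = dict(line.split(" ", 1) for line in lines)
    oid = fields["oid"].split(":", 1)[1]  # drop the "sha256:" prefix
    return lfs_store[oid]


big_file = b"\x00" * 5_000_000      # what sits in the work-tree
stored_in_git = clean(big_file)     # what git add puts into the index
restored = smudge(stored_in_git)    # what git checkout writes back out
```

Here stored_in_git is only a short text pointer, which is all that a clone ever transfers over SSH, while restored is the full file again.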

Hence, when you are doing a git clone operation, all you are transferring across the SSH connection is the Git data: the commit, tree, blob, and annotated-tag objects that go into the Git repository, and the other data that Git uses to keep track of these objects, as encoded in the so-called smart protocol. This connection gets closed before Git runs git checkout.

Having closed the ssh connection, Git now runs git checkout. This runs the smudge filter on any Git-LFS files; the smudge filter in Git-LFS opens https connections to the third-party storage facility at this point, when necessary.
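You can observe the two phases yourself by running them separately. The sketch below is one way to do that from Python, assuming git (and git-lfs, if the repository uses it) is installed; clone_phases and run_phases are hypothetical helper names, not part of Git. It relies on git clone --no-checkout to stop after the transfer, then runs git checkout as a separate command.

```python
import subprocess


def clone_phases(url, dest):
    """Return the two phases of a clone as separate command lines."""
    return [
        # Phase 1: fetch the Git objects over the URL's transport (SSH here).
        # --no-checkout stops before any file is written to the work-tree.
        ["git", "clone", "--no-checkout", url, dest],
        # Phase 2: populate the work-tree from the index. Smudge filters,
        # including the Git-LFS one, run during this step.
        ["git", "-C", dest, "checkout"],
    ]


def run_phases(url, dest):
    """Run each phase in turn, stopping at the first failure."""
    for argv in clone_phases(url, dest):
        print("running:", " ".join(argv))
        subprocess.run(argv, check=True)
```

Calling run_phases("ssh://git@<serverurl>:7999/text/large_files.git", "large_files"), with your real server name filled in, should show that the SSH transfer has already finished by the time the second command starts. Any https traffic to the LFS store belongs to that second command.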

torek