Not entirely sure if you still care but I found your question and needed the answer myself. Ended up figuring it out by reading some rather dry documentation.
from pygit2 import Repository, Patch
from colorama import Fore

# repo_path, commit_a and commit_b are yours to supply.
# Repository() opens an existing repo; init_repository() is meant for creating one.
git_repo = Repository(repo_path)
diff = git_repo.diff(commit_a, commit_b, context_lines=0, interhunk_lines=0)

# A diff can contain Patches or Diffs. We care about Patches (the changes to a single file).
for obj in diff:
    if isinstance(obj, Patch):
        print(f"Found a patch for file {obj.delta.new_file.path}")
        # A hunk is the change in a file, plus some lines surrounding the change. This allows merging etc. in Git.
        # https://www.gnu.org/software/diffutils/manual/html_node/Hunks.html
        for hunk in obj.hunks:
            for line in hunk.lines:
                # new_lineno is the location of the line after the patch. If it's -1, the line has been deleted.
                if line.new_lineno == -1:
                    print(f"[{Fore.RED}removal line {line.old_lineno}{Fore.RESET}] {line.content.strip()}")
                # Similarly, if a line did not previously have a place in the file, it's been added fresh.
                elif line.old_lineno == -1:
                    print(f"[{Fore.GREEN}addition line {line.new_lineno}{Fore.RESET}] {line.content.strip()}")
As you can see, a diff can contain multiple Patches and Diffs, so we need to loop over them. The Diff object behaves like a collection you can iterate over (the documentation does not make this very clear). The Patches contain the info we need: each one describes the changes to a single file. The lines that actually changed are found in the patch's hunks. "Hunk" is a term from the GNU diffutils documentation and describes a change plus some surrounding context. With context_lines=0, as used here, the hunks hold only the changed lines themselves.
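
If you raise context_lines above 0, the hunks will also contain unchanged lines, which have both an old and a new line number. The -1 check can be pulled into a small helper that handles that case too. This is only a rough sketch: `classify_line` and `FakeLine` are names I made up, and `FakeLine` is just a stand-in carrying the two attributes used above, so the snippet runs without a repository.

```python
from collections import namedtuple

# Hypothetical stand-in for a diff line; it only has the two attributes we check.
FakeLine = namedtuple("FakeLine", ["old_lineno", "new_lineno"])


def classify_line(line):
    """Return 'removed', 'added' or 'context' based on the -1 sentinel."""
    if line.new_lineno == -1:
        # No position in the new file: the line was deleted.
        return "removed"
    if line.old_lineno == -1:
        # No position in the old file: the line was added.
        return "added"
    # Both positions are known: an unchanged context line.
    return "context"


lines = [FakeLine(3, -1), FakeLine(-1, 3), FakeLine(4, 4)]
kinds = [classify_line(line) for line in lines]
print(kinds)
```

In the loop above you would call `classify_line(line)` on each entry of `hunk.lines` and pick the colour from the result.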