9

Is it possible to reconstruct a word-by-word history in version control? Ideally, what I'd like to do is something like this: 1) I indicate the range of lines of interest; 2) the program figures out the corresponding line numbers in previous versions, since code often moves up or down between versions (potentially limiting the range of versions, say since revision 19, or since a week ago); 3) it prints out a word-by-word history, showing either the version in which each group of words was last changed or the author who changed it. So it's kind of like svn blame or git blame at a word-by-word level.

Failing that, are there tools that can do #1 and #2 above? That is, 1) I indicate the range of lines of interest, 2) the program figures out the corresponding line numbers in previous versions, and 3) the program prints out the history of these lines (whenever they changed).

A solution for either svn or git would be really helpful for me.

ceiling cat
  • 1
    Word by word! I don't think so; git tracks changes line by line. Are you trying to use git for writers? For programmers, I don't think this level of blame is required. – egghese Jul 20 '13 at 04:27
  • Yes, I am trying to do this on a LaTeX document, at least at the moment. It doesn't need to be something built into git, though. I imagine a program external to git that can read the git history could do this too. – ceiling cat Jul 20 '13 at 07:01
  • 3
    @JeslyVarghese: Git tracks changes snapshot by snapshot. The line-based format is calculated on-the-fly, and it would also be possible to have a word-based format. – nosid Jul 20 '13 at 07:34

2 Answers

2

I looked for something like this and ended up hacking up my own solution. You can find it here:

https://github.com/d33tah/wordblame

Basically, it creates a new repository directory in which every space is replaced by a newline plus a unique string signalling that there was a space, so each word ends up on its own line. Then "git blame" is executed on that repository and its per-line result is reinterpreted as a per-word result.
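To illustrate the idea, here is a minimal Python sketch of just the text transformation. It is not the actual wordblame code: the marker string and the function names (`explode`, `implode`, `word_blame`) are made up for this example, and it assumes the marker never occurs in the file. The real tool also has to apply the transformation to every commit in the history before running "git blame", which this sketch leaves out.

```python
# Hypothetical marker; any string that never occurs in the file would do.
SPACE_MARKER = "<<SP>>"


def explode(text, marker=SPACE_MARKER):
    """Put every space-separated word on its own line.

    A word that was followed by a space gets the marker appended, so the
    original layout can be rebuilt later.
    """
    out = []
    for line in text.split("\n"):
        words = line.split(" ")
        for word in words[:-1]:
            out.append(word + marker)  # this word was followed by a space
        out.append(words[-1])          # last word on the line: real newline
    return "\n".join(out)


def implode(exploded, marker=SPACE_MARKER):
    """Undo explode(): a marker followed by a newline becomes a space again."""
    return exploded.replace(marker + "\n", " ")


def word_blame(exploded, line_authors, marker=SPACE_MARKER):
    """Pair each exploded line (one word) with the author blamed for that line.

    line_authors is assumed to hold one entry per line of the exploded file,
    for example as parsed from blame output on the transformed repository.
    """
    result = []
    for word, author in zip(exploded.split("\n"), line_authors):
        if word.endswith(marker):
            word = word[: -len(marker)]
        result.append((word, author))
    return result


if __name__ == "__main__":
    original = "hello brave world\nsecond line"
    exploded = explode(original)
    assert implode(exploded) == original
    # Made-up authors, one per exploded line, standing in for blame output.
    authors = ["alice", "bob", "alice", "carol", "carol"]
    for word, author in word_blame(exploded, authors):
        print(author, word)
```

Because each word sits on its own line in the transformed files, an ordinary line-based blame reports one commit or author per word, and `implode` shows that the original text can still be recovered.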

d33tah
  • FYI I've ported your code to Python 3 since I got some encoding errors with Python 2: https://github.com/mdamien/wordblame/commit/7fd67de54653893015b9cc724905b01c4fc3d341 – damio Feb 24 '19 at 00:39
0

I've made a tool called git-word-blame to solve this exact problem:

# setup
> virtualenv -p python3 venv
> source venv/bin/activate
> pip install git-word-blame

# usage
> git word-blame your-file
> firefox /tmp/git-word-blame/word-blame-by-commit.html

It should look like this:

[Screenshot: git word-blame output]

EDIT: Here's another project, from 2016, that attempts character-based blame: https://cregit.linuxsources.org/

damio