I'm running diffstat on a merge commit to see how many insertions, deletions and modifications I made, like so:

git show 526162eed6 --first-parent --unified=0 | diffstat -m

This lists all the files and gives a summary at the end:

a/b/c | 10 ++++++++++
a/b/d |  5 +++++
...
10 files changed, 50 insertions(+), 10 modifications(!)

However, I'd like to see all values even when they are zero:

10 files changed, 50 insertions(+), 0 deletions(-), 10 modifications(!)

How can I do this? My current workaround is to output CSV via `... | diffstat -mt` and add up the columns with awk. Is there a simpler way?

Thomas Dickey
PhD
  • Not an answer, because it suffers from the same "problem", but why pipe to an external program instead of simply running `git diff --stat` (or `git show --stat`)? There's also `--shortstat` – knittl Jul 10 '21 at 05:23
  • @knittl: That doesn’t give you modifications AFAIK. Only insertions and deletions. Hence `diffstat`. Unless there’s a flag that you know can help? – PhD Jul 10 '21 at 05:49
  • I see. The modification metric is a heuristic, I wouldn't rely too much on it. The diff output only contains insertions (+) and deletions (-) (but that's not your question :)) – knittl Jul 10 '21 at 06:31
  • I agree. It's still a good approximation, though: it can serve as a lower bound that captures the notion of a modification without straying too far from the truth. – PhD Jul 10 '21 at 07:05
  • btw, you know about the `--stat` option on anything that uses git's diff core? `git show --stat`? Still leaves you with the same problem and all, just easier to get there. – jthill Jul 10 '21 at 20:10

1 Answer

I couldn't find an option that does what you want. diffstat is a tool that produces human-readable output; it is not intended for machine consumption.

If you absolutely must parse or massage its output, you could use a very dirty hack (not recommended, it can break at any time). Define these shell functions:

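# Reads diffstat's one-line summary from stdin and prints every counter,
# substituting 0 for any counter that diffstat left out.
stats() {
  read -r stat
  # In each pipeline the last grep fails when nothing matched, so echo prints 0.
  echo "$stat" | grep -o '[0-9]\+ file' | grep -o '[0-9]\+' || echo '0'
  echo 'files changed,' # does not match original output for 1 file
  echo "$stat" | grep -o '[0-9]\+ ins' | grep -o '[0-9]\+' || echo '0'
  echo 'insertions(+),'
  echo "$stat" | grep -o '[0-9]\+ del' | grep -o '[0-9]\+' || echo '0'
  echo 'deletions(-),'
  echo "$stat" | grep -o '[0-9]\+ mod' | grep -o '[0-9]\+' || echo '0'
  echo 'modifications(!)'
}
diffstats() {
  # -s prints only the summary line, -m adds the modification count.
  # The explicit "-" makes paste read stdin on implementations that require a file operand.
  diffstat -sm | stats | paste -sd ' ' -
}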

and then:

git diff | diffstats
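
For comparison, the CSV route mentioned in the question could be sketched roughly as follows. This is only a sketch: `sum_diffstat_csv` is a hypothetical helper name, and it assumes that `diffstat -mt` emits rows whose first three comma-separated fields are the inserted, deleted and modified counts, followed by the file name. Any row whose first field is not a plain number (such as a header row) is skipped.

```shell
# Hypothetical helper: sums the numeric columns of CSV read from stdin.
# Assumed column order: inserted,deleted,modified,filename.
sum_diffstat_csv() {
  awk -F, '
    $1 !~ /^[0-9]+$/ { next }    # skip header or any non-data row
    { ins += $1; del += $2; mod += $3; files++ }
    END {
      # Unset awk variables format as 0 with %d, so zero counts are always printed.
      printf "%d files changed, %d insertions(+), %d deletions(-), %d modifications(!)\n",
             files, ins, del, mod
    }'
}
```

It would then be used like `git show 526162eed6 --first-parent --unified=0 | diffstat -mt | sum_diffstat_csv`. Because only the first three fields are read, commas inside file names do not affect the totals.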
knittl
  • This is close to what I was thinking. Given the brittleness, I suggested the CSV/awk tabulation _hack_ instead. That may be much less brittle. – PhD Jul 10 '21 at 07:07
  • @PhD parsing the CSV and manually summing the numbers is probably the best option then. My solution probably isn't even simpler than using awk to sum 4 columns – knittl Jul 10 '21 at 08:51