
I ran a search and didn't come up with anything, so I decided to see what the community has to offer on this score.

When you push a commit to GitHub, it displays a diff. The diff tells you how many lines were added and how many lines were removed.

From a metrics perspective, this doesn't tell me much, since many of the languages I use don't depend on whitespace characters to function. An entire class with 50 methods could be defined on a single line (not that you would do that, of course).

Measuring lines can lead us to believe that single-line solutions are inherently better than the alternative.

So, I'm curious if there's a way to have GitHub (or Git in general) display the difference in non-whitespace character count.

Example:

class Something
{
    function hello()
    {
    }
}

Changed to:

class Something {
    function hello() {
    }
}

Would result in something like this:

Line change: -2 Character change: 0
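
The desired metric can be sketched in a few lines of Python. This is only an illustration of the idea, not something Git or GitHub provides: the function names non_whitespace_count and change_summary are made up for this example, and the two versions of the class are passed in as plain strings.

```python
def non_whitespace_count(text: str) -> int:
    """Count the characters in text that are not whitespace."""
    return sum(1 for ch in text if not ch.isspace())


def change_summary(old: str, new: str) -> tuple[int, int]:
    """Return (line change, non-whitespace character change) from old to new."""
    line_change = len(new.splitlines()) - len(old.splitlines())
    char_change = non_whitespace_count(new) - non_whitespace_count(old)
    return line_change, char_change


# The two versions of the class from the example above.
OLD = "class Something\n{\n    function hello()\n    {\n    }\n}\n"
NEW = "class Something {\n    function hello() {\n    }\n}\n"

line_change, char_change = change_summary(OLD, NEW)
print(f"Line change: {line_change} Character change: {char_change}")
# prints "Line change: -2 Character change: 0"
```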

Josh Bruce

2 Answers


To find the line change, use the command below. It counts the lines that start with + and skips the +++ file header, so nothing has to be subtracted from the result:

git diff abc.txt | grep '^+' | grep -cv '^+++ '

Output: 2

git diff abc.txt

Output:

diff --git a/abc.txt b/abc.txt
index 9ab6740..c2ab3e3 100644
--- a/abc.txt
+++ b/abc.txt
@@ -1,5 +1,5 @@
 my
-qq
+qq1
 wq
 my
 q
@@ -8,7 +8,7 @@ q
 q
 q
 q
-q
+q1
 q
 q
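
For the character count that the question actually asks about, one option is a small custom script over the unified diff text. The following is a minimal sketch under stated assumptions: summarize_unified_diff and SAMPLE_DIFF are hypothetical names, the input is ordinary `git diff` output like the one shown above, and every non-whitespace character on a changed line is counted, even if only part of the line changed.

```python
def summarize_unified_diff(diff_text: str) -> dict:
    """Count added/removed lines and their non-whitespace characters."""
    counts = {"lines_added": 0, "lines_removed": 0,
              "chars_added": 0, "chars_removed": 0}
    in_hunk = False  # the ---/+++ file headers appear before the first @@
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            counts["lines_added"] += 1
            counts["chars_added"] += sum(not c.isspace() for c in line[1:])
        elif in_hunk and line.startswith("-"):
            counts["lines_removed"] += 1
            counts["chars_removed"] += sum(not c.isspace() for c in line[1:])
    return counts


# The diff output shown above, reproduced as a string.
SAMPLE_DIFF = "\n".join([
    "diff --git a/abc.txt b/abc.txt",
    "index 9ab6740..c2ab3e3 100644",
    "--- a/abc.txt",
    "+++ b/abc.txt",
    "@@ -1,5 +1,5 @@",
    " my", "-qq", "+qq1", " wq", " my", " q",
    "@@ -8,7 +8,7 @@ q",
    " q", " q", " q", "-q", "+q1", " q", " q",
])

result = summarize_unified_diff(SAMPLE_DIFF)
print(result)
print("Character change:", result["chars_added"] - result["chars_removed"])
```

For the sample diff this reports 2 added and 2 removed lines, with a net character change of +2. The diff text could equally come from running `git diff abc.txt` through subprocess instead of a hard-coded string.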
Rishabh Dugar
  • I don't want to downvote this, but I don't think it achieves the desired outcome. Line count is not the issue; character count is. I don't see where this answer displays the character count or how it changed. Am I missing it? – Josh Bruce Nov 07 '17 at 06:54
  • Sure, but I tried this, and I don't think there is a direct way to do the second part of your question. The only way is to write a custom program that computes it. – Rishabh Dugar Nov 07 '17 at 07:41
0

I had kind of the same problem and tried to solve it using the --word-diff parameter for git diff.

tl;dr

echo $(git diff --cached --word-diff --color | grep -oP '\x1b\[[0-9;]*m\{\+([^\x1b]*)\+\}\x1b' | cut -d+ -f 2- | rev | cut -d+ -f 2- | tr -d '\n' | wc -c) - $(git diff --cached --word-diff --color | grep -oP '\x1b\[[0-9;]*m\[\-([^\x1b]*)\-\]\x1b' | cut -d- -f 2- | rev | cut -d- -f 2- | tr -d '\n' | wc -c) | bc


Some explanation:

  • git diff --cached --word-diff --color
    Prints the word diff of the currently staged changes against HEAD.
  • grep -oP '\x1b\[[0-9;]*m\{\+([^\x1b]*)\+\}\x1b'
    Filters for all added content (this also works if the colors have been changed in the git config).
  • cut -d+ -f 2- | rev | cut -d+ -f 2-
    A complicated way of extracting the content between the surrounding {+ and +}. I couldn't come up with a reliable alternative.
  • tr -d '\n' | wc -c
    Counts the characters (strictly speaking, wc -c counts bytes), excluding newlines.

The second part is essentially the same, but it matches [- and -] instead of {+ and +} to capture the deleted content. Finally, echo joins the two numbers with a - so that bc can calculate the difference.

One additional comment: I opted for colorized output because it makes pattern matching much more reliable in cases where the changed text itself contains the characters used to mark the start or end (such as +} or [-).

Here is basically the same logic written in Python:

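import re
import sys
from subprocess import PIPE, Popen

# Word diff of the staged changes against HEAD, with color codes forced on.
process = Popen(["git", "diff", "--cached", "--word-diff", "--color"], stdout=PIPE, stderr=PIPE)
stdout, stderr = process.communicate()

# Stop if git reported a problem (for example, when run outside a repository).
if process.returncode != 0 or stderr:
    print(stderr.decode(), file=sys.stderr)
    sys.exit(1)

stdout = stdout.decode()
# Added text is wrapped in {+ +}, deleted text in [- -]; both follow a color escape sequence.
added = re.findall(r"\x1b\[[0-9;]*m\{\+([^\x1b]*)\+\}\x1b", stdout)
deleted = re.findall(r"\x1b\[[0-9;]*m\[-([^\x1b]*)-\]\x1b", stdout)
num_added = sum(len(a) for a in added)
num_deleted = sum(len(d) for d in deleted)
# Net character change: positive means more characters were added than deleted.
print(num_added - num_deleted)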

Save it in a file and call it from within the git repository.
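
To illustrate the point about colorized output, here is a small check of the two patterns against a hand-written sample line. The escape sequences in the sample (red for deleted, green for added, followed by a reset) are an assumption about what git typically emits with default colors, and the sample text itself is made up. The added word deliberately contains +}, which would confuse a match on the plain markers alone.

```python
import re

# Same patterns as in the script above.
ADDED = re.compile(r"\x1b\[[0-9;]*m\{\+([^\x1b]*)\+\}\x1b")
DELETED = re.compile(r"\x1b\[[0-9;]*m\[-([^\x1b]*)-\]\x1b")

# Assumed default color codes; real output depends on the git configuration.
RED, GREEN, RESET = "\x1b[31m", "\x1b[32m", "\x1b[m"

# "old" was replaced by "a+}b", an added word that contains the end marker.
sample = "value = " + RED + "[-old-]" + RESET + GREEN + "{+a+}b+}" + RESET + " done"

added = ADDED.findall(sample)
deleted = DELETED.findall(sample)
net_change = sum(map(len, added)) - sum(map(len, deleted))
print(added, deleted, net_change)
```

Because each match is anchored on the escape sequence that follows the closing marker, the inner +} is kept as part of the added text rather than ending the match early.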