I want to count the number of hard tab characters
in my documents in a Unix shell.
How can I do it?
I tried something like
grep -c \t foo
but that counts the lines containing the letter t in file foo.
Use tr to discard everything except tabs, and then count:
< input-file tr -dc \\t | wc -c
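As a rough sketch, the same pipeline can be wrapped in a small function to report a count per file. The name `count_tabs` is a hypothetical helper chosen for illustration, not a standard command.

```shell
# count_tabs is a hypothetical helper, not a standard utility.
# Prints "<count> <file>" for each file passed as an argument.
count_tabs() {
    for f in "$@"; do
        # tr -dc '\t' deletes every byte that is not a tab;
        # wc -c then counts the bytes that are left.
        n=$(tr -dc '\t' < "$f" | wc -c)
        # $((n)) strips the leading padding some wc implementations print.
        printf '%s %s\n' "$((n))" "$f"
    done
}
```

Calling `count_tabs foo bar` would then print one line per file.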
Bash uses a $'...' notation for specifying special characters:
grep -c $'\t' foo
Note that grep -c counts the lines that contain a tab, not the individual tab characters.
Use a Perl regex (the -P option, available in GNU grep) to match tab characters.
So, to count the number of tab characters in a file:
grep -o -P '\t' foo | wc -l
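The difference between counting lines and counting occurrences can be seen on a small sample. This is only a sketch: it assumes a grep that supports `-o` (GNU and BSD grep do), and it keeps the tab in a shell variable so it does not depend on `-P`. The sample file and variable names are made up for the example.

```shell
# Temporary sample: two of its three lines contain tabs, three tabs in total.
sample=$(mktemp)
printf 'a\tb\tc\nno tabs here\n\td\n' > "$sample"

# Hold a literal tab in a variable (portable alternative to '\t' with -P).
tab=$(printf '\t')

# -c counts matching lines, not individual matches.
lines_with_tabs=$(grep -c "$tab" "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
total_tabs=$(grep -o "$tab" "$sample" | wc -l)

echo "lines with tabs: $lines_with_tabs, total tabs: $((total_tabs))"
rm -f "$sample"
```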
You can insert a literal TAB character between the quotes by pressing Ctrl+V and then TAB.
In general, you can insert any character at all by prefixing it with Ctrl+V, even control characters such as Enter or Ctrl+C that the shell would otherwise interpret.
You can use awk in a tricky way: use tab as the record separator; the number of tab characters is then the total number of records minus 1 (this is off by one if the file's last character is a tab, and it prints -1 for an empty file):
ntabs=$(awk 'BEGIN {RS="\t"} END {print NR-1}' foo)
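A different awk technique, shown here only as an alternative sketch, keeps the default record separator and sums the return value of `gsub`, which reports how many substitutions it made on each line. The function name `count_tabs_awk` is hypothetical.

```shell
# count_tabs_awk is a hypothetical helper name.
# Reads the file given as $1, or standard input if no argument is given.
count_tabs_awk() {
    # gsub returns the number of replacements made in the current line;
    # n + 0 makes awk print 0 rather than an empty string for empty input.
    awk '{ n += gsub(/\t/, "") } END { print n + 0 }' "$@"
}
```

Because it counts per line, this variant does not depend on whether the file ends with a tab or a newline.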
My first thought was to use sed to strip out all non-tab characters, then use wc to count the number of characters left.
< foo.txt sed 's/[^\t]//g' | wc -c
However, this also counts newlines, which sed won't touch because it is line-based. So, let's use tr to translate all the newlines into spaces, so it is one line for sed.
< foo.txt tr '\n' ' ' | sed 's/[^\t]//g' | wc -c
Depending on your shell and implementation of sed, you may have to use a literal tab instead of \t; with Bash and GNU sed, however, the above works.
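For a sed that does not understand \t inside a bracket expression, one possible sketch is to build the literal tab with printf and splice it into the pattern. The helper name `count_tabs_sed` is hypothetical, and the extra `tr -d '\n'` is an assumption-driven safeguard for sed implementations that append a newline to their output.

```shell
# count_tabs_sed is a hypothetical helper name; $1 is the file to scan.
count_tabs_sed() {
    # A literal tab character, produced without relying on \t support in sed.
    tab=$(printf '\t')
    # 1. tr joins all lines into one by turning newlines into spaces.
    # 2. sed deletes every character that is not a tab.
    # 3. tr -d '\n' drops a trailing newline if sed added one.
    # 4. wc -c counts the remaining bytes, which are all tabs.
    n=$(tr '\n' ' ' < "$1" | sed "s/[^$tab]//g" | tr -d '\n' | wc -c)
    echo "$((n))"
}
```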