9

I want to count the number of hard tab characters in my documents from a Unix shell.

How can I do it?

I tried something like

grep -c \t foo

but it counts the lines containing the letter t in file foo.

ravi
    Do you want to count the number of tab chars or the number of lines containing tab chars? In the example you gave, if `\t` had worked, you would have got the latter (*number of lines containing tabs*). – Shawn Chin Jun 14 '12 at 14:43

6 Answers

15

Use tr to discard everything except tabs, and then count:

< input-file tr -dc \\t | wc -c
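
As a minimal sketch of the same idea wrapped in a reusable function: `count_tabs` is a hypothetical name chosen for illustration, and the sample file is a throwaway created with `mktemp`. The extra `tr -d ' '` is there only because some `wc` implementations pad their output with spaces.

```shell
# count_tabs is a hypothetical helper name, not part of any standard tool.
count_tabs() {
    # tr -dc '\t' deletes every byte except tabs; wc -c counts what is left.
    # The final tr strips the padding that some wc implementations print.
    tr -dc '\t' < "$1" | wc -c | tr -d ' '
}

# Demo on a throwaway file that contains three tabs.
sample=$(mktemp)
printf 'a\tb\tc\n\td\n' > "$sample"
count_tabs "$sample"    # prints 3
rm -f "$sample"
```

Since a tab is a single byte, counting bytes with `wc -c` gives the number of tab characters regardless of the file's encoding.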
William Pursell
  • Wonder who else is here looking to answer this https://cmdchallenge.com/#/find_tabs_in_a_file :) – BenB Feb 04 '17 at 01:27
11

Bash uses the $'...' notation (ANSI-C quoting) for specifying special characters:

grep -c $'\t' foo
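
Note that `grep -c` counts matching lines rather than individual matches. The sketch below contrasts the two counts; it assumes Bash (for `$'\t'`) and a grep that supports `-o`, as GNU and BSD grep do. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Both helper names are hypothetical, chosen for illustration only.

# Number of lines that contain at least one tab.
count_tab_lines() {
    grep -c $'\t' "$1"
}

# Number of tab characters: -o prints each match on its own line.
count_tab_chars() {
    grep -o $'\t' "$1" | wc -l | tr -d ' '
}

# Demo: line 1 has two tabs, line 2 has none, line 3 has one.
sample=$(mktemp)
printf 'a\tb\tc\nno tabs here\n\td\n' > "$sample"
count_tab_lines "$sample"    # prints 2
count_tab_chars "$sample"    # prints 3
rm -f "$sample"
```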
chepner
4

Use a Perl-compatible regex (the -P option of GNU grep) to match tab characters.

So, to count the number of tab characters in a file:

grep -o -P '\t' foo | wc -l
dogbane
3

You can insert a literal tab character between the quotes by pressing Ctrl+V and then Tab.

In general you can insert any character at all by prefixing it with Ctrl+V, even control characters such as Enter or Ctrl+C that the shell would otherwise interpret.
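
Ctrl+V only helps at an interactive prompt. In a script, one possible equivalent is to capture a literal tab in a variable with `printf`. The following is a sketch; `tab` and `lines_with_tab` are hypothetical names.

```shell
# printf '\t' is specified by POSIX, so this also works in plain sh.
# Command substitution strips trailing newlines only, so the tab survives.
tab=$(printf '\t')

# lines_with_tab is a hypothetical helper name used only for illustration.
lines_with_tab() {
    # The variable expands to a literal tab, which grep matches as-is.
    grep -c "$tab" "$1"
}

# Demo: two of the three lines contain a tab.
sample=$(mktemp)
printf 'a\tb\nplain\n\tc\n' > "$sample"
lines_with_tab "$sample"    # prints 2
rm -f "$sample"
```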

Joni
1

You can use awk in a tricky way: use tab as the record separator, then the number of tab characters is the total number of records minus 1:

ntabs=$(awk 'BEGIN {RS="\t"} END {print NR-1}' foo)
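
Two edge cases are worth checking with the record-separator approach: an empty file gives NR-1 = -1, and, as far as I can tell, awk does not produce an empty final record when the input ends in a tab, so the result is one short in that case. A sketch of an alternative that counts matches with `gsub` instead (this is a different technique, and `count_tabs_awk` is a hypothetical name):

```shell
# count_tabs_awk is a hypothetical helper name.
count_tabs_awk() {
    # gsub returns the number of replacements made in the current record;
    # replacing each tab with itself ("&") leaves the text unchanged.
    # n + 0 forces a numeric 0 when no record contained a tab.
    awk '{ n += gsub(/\t/, "&") } END { print n + 0 }' "$1"
}

# Demo: three tabs in total.
sample=$(mktemp)
printf 'a\tb\tc\n\td\n' > "$sample"
count_tabs_awk "$sample"    # prints 3
rm -f "$sample"
```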
glenn jackman
0

My first thought was to use sed to strip out all non-tab characters, then use wc to count the number of characters left.

< foo.txt sed 's/[^\t]//g' | wc -c

However, this also counts newlines, which sed won't touch because it is line-based. So, let's use tr to translate all the newlines into spaces, so it is one line for sed.

< foo.txt tr '\n' ' ' | sed 's/[^\t]//g' | wc -c

Depending on your shell and implementation of sed, you may have to use a literal tab instead of \t. With Bash and GNU sed, however, the above works.
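
A variant that may be more portable, as a sketch: pass sed a literal tab through a variable, strip the non-tab characters line by line, and delete the newlines afterwards. That way nothing depends on sed understanding \t inside a bracket expression or on how it treats a last line without a newline. `tab` and `count_tabs_sed` are hypothetical names.

```shell
# Literal tab obtained via POSIX printf.
tab=$(printf '\t')

# count_tabs_sed is a hypothetical helper name.
count_tabs_sed() {
    # LC_ALL=C makes sed treat the input as plain bytes.
    # sed removes everything except tabs on each line, tr then drops the
    # newlines sed leaves behind, and wc -c counts the remaining tabs.
    LC_ALL=C sed "s/[^${tab}]//g" "$1" | tr -d '\n' | wc -c | tr -d ' '
}

# Demo: three tabs in total.
sample=$(mktemp)
printf 'a\tb\tc\n\td\n' > "$sample"
count_tabs_sed "$sample"    # prints 3
rm -f "$sample"
```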

LukeShu