Count occurrences of a char in plain text file

Question

Is there any way, in a Linux terminal, to count how many times the character f occurs in a plain text file?

Technically this could be considered a sh/bash/etc. programming question, so I think it has validity in either place. – Rob Hruska Oct 21 '09 at 21:51
@Rob Hruska: yes, I also think it is bash programming... @abrashka: the answer to your first and second questions is "NO"! – cupakob Oct 22 '09 at 07:33

Cascabel · Accepted Answer · 2009-10-21T21:45:44.897

188

How about this:

fgrep -o f <file> | wc -l

Note: Besides being much easier to remember, reproduce and customize, this is about three times faster than Vereb's answer (edited: I botched the first test).
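As a quick sanity check of the pipeline above, here is a minimal sketch on a throwaway file. The `count_char` helper and the sample text are made up for illustration, and `grep -F` is used in place of `fgrep`, which should behave the same way here.

```shell
# Hypothetical helper: count occurrences of the fixed string $1 in file $2.
count_char() {
    # -o prints each match on its own line; -F treats the pattern literally.
    grep -oF -- "$1" "$2" | wc -l
}

sample=$(mktemp)    # throwaway file for the demo
printf 'fluffy fox\nno match here\nfife\n' > "$sample"

f_count=$(count_char f "$sample")
echo "$f_count"

rm -f "$sample"
```

Because each match lands on its own output line, `wc -l` is the right counter for this pipeline.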

edited Oct 21 '09 at 21:45

answered Oct 21 '09 at 21:37

Cascabel

This one doesn't work if you need to count `\r` or `\n` characters; the `tr -cd f` answer does work for that. – bjnord Oct 05 '13 at 00:08
To count several characters, e.g. `a`, `b` and `c`, use `egrep`: `egrep -o 'a|b|c' | wc -l`. – Skippy le Grand Gourou Apr 03 '17 at 13:29
Also, be careful NOT to use `wc -c` as in the `tr` answer: since `grep` outputs one match per line, `wc -c` would count the end-of-line characters too (hence doubling the number of characters). – Skippy le Grand Gourou Apr 03 '17 at 13:34
@bjnord Ok for `\r`, but to count `\n` why not just use `wc -l` ? – Skippy le Grand Gourou Apr 03 '17 at 13:35
Warning: `fgrep` is obsolescent; use `grep -F`. e.g. `grep -oF f | wc -l` – Qumber Nov 19 '22 at 09:49
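The two points from the comments above (several characters at once, and why `wc -c` over-counts after `grep -o`) can be checked with a small sketch. The sample contents and variable names are illustrative only, and `grep -E` stands in for `egrep`.

```shell
sample=$(mktemp)
printf 'abcabc\nxyz\ncab\n' > "$sample"

# One match per output line, so counting lines counts matches.
lines=$(grep -oE 'a|b|c' "$sample" | wc -l)

# Counting bytes also counts the newline printed after each match.
bytes=$(grep -oE 'a|b|c' "$sample" | wc -c)

echo "matches: $lines, bytes: $bytes"

rm -f "$sample"
```

For single-character matches the byte count comes out at exactly twice the match count, which is the doubling the comment warns about.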

score 75 · Answer 2 · edited Jan 29 '13 at 10:05

Even faster:

tr -cd f < file | wc -c

Time for this command on a 4.9 MB file with 1,100,000 occurrences of the searched character:

real   0m0.089s
user   0m0.057s
sys    0m0.027s

Time for Vereb's answer (echo, cat, tr and bc) on the same file:

real   0m0.168s
user   0m0.059s
sys    0m0.115s

Time for Rob Hruska's answer (tr, sed and wc) on the same file:

real   0m0.465s
user   0m0.411s
sys    0m0.080s

Time for Jefromi's answer (fgrep and wc) on the same file:

real   0m0.522s
user   0m0.477s
sys    0m0.023s
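For readers unfamiliar with the flags: `-d` deletes characters and `-c` complements the set, so `tr -cd f` deletes everything except `f`, and `wc -c` counts the bytes that remain. A minimal sketch on a made-up sample follows; the file contents and variable names are illustrative. Timings depend on the machine and the file, so the sketch only hints at how to measure, not what to expect.

```shell
sample=$(mktemp)
printf 'fluffy fox\nno match here\nfife\n' > "$sample"

# Delete every byte except 'f', then count what is left.
f_count=$(tr -cd f < "$sample" | wc -c)

# The same idea works for newlines, which grep -o cannot count.
nl_count=$(tr -cd '\n' < "$sample" | wc -c)

echo "f: $f_count, newlines: $nl_count"

# To compare speed on your own data in bash, prefix a pipeline with `time`, e.g.
#   time tr -cd f < yourfile | wc -c

rm -f "$sample"
```

Note that this counts bytes, so it suits single-byte characters such as plain ASCII letters.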

edited Jan 29 '13 at 10:05

erik

answered Jan 17 '13 at 00:33

user1985553

To count several characters, e.g. `a`, `b` and `c` : `tr -cd abc < file | wc -l`. – Skippy le Grand Gourou Apr 03 '17 at 13:26
Are you sure? Wasn't that supposed to be `tr -cd abc < file | wc -c` instead? – Mithun B May 09 '20 at 18:36
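The difference between the two commands in these comments is easy to check. In this sketch (sample contents and variable names are made up), `tr -cd abc` strips the newlines along with everything else, so `wc -l` has nothing left to count.

```shell
sample=$(mktemp)
printf 'abcabc\nxyz\ncab\n' > "$sample"

# tr -cd abc also deletes the newlines, so wc -l finds no lines.
as_lines=$(tr -cd abc < "$sample" | wc -l)

# wc -c counts the surviving bytes, one per a, b or c.
as_bytes=$(tr -cd abc < "$sample" | wc -c)

echo "wc -l: $as_lines, wc -c: $as_bytes"

rm -f "$sample"
```

So with `tr -cd`, `wc -c` is the counter to use; `wc -l` belongs with the `grep -o` pipeline.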

score 10 · Answer 3 · edited Jan 29 '13 at 09:20

echo $(cat <file>  | wc -c) - $(cat <file>  | tr -d 'A' | wc -c) | bc

where A is the character to count.

Time for this command on a 4.9 MB file with 1,100,000 occurrences of the searched character:

real   0m0.168s
user   0m0.059s
sys    0m0.115s

edited Jan 29 '13 at 09:20

Jens Erat

answered Oct 21 '09 at 21:05

Vereb

This gets about a third faster if you take out the unnecessary `cat`s, giving the filename as an argument to `wc` and `tr`. – Cascabel Oct 21 '09 at 21:49
If you really want to optimize, this reads the file just once: `echo $(stat -c%s <file>) - $(cat <file> | tr -d 'A' | wc -c) | bc` – Vereb Oct 21 '09 at 22:01
@Vereb - `tr` only reads `stdin`, but the file can be redirected rather than `cat`ed: `tr -d 'A' < <file> | wc ...` – dsz Nov 16 '15 at 04:28
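Putting the suggestions from these comments together gives something like the sketch below: no `cat`, the file size taken from `stat`, and shell arithmetic in place of `bc` (that last swap is a substitution made here, not something the comments propose). It assumes GNU `stat`; the sample contents and variable names are illustrative.

```shell
sample=$(mktemp)
printf 'BANANA\nbandana\n' > "$sample"

# Bytes left after deleting every 'A'; redirection replaces the cat.
without=$(tr -d 'A' < "$sample" | wc -c)

# Total size two ways: reading the file, or asking the filesystem.
total_wc=$(wc -c < "$sample")
total_stat=$(stat -c%s "$sample")    # GNU stat; BSD/macOS stat uses different flags

# Shell arithmetic stands in for bc.
a_count=$((total_stat - without))
echo "$a_count"

rm -f "$sample"
```

With `stat`, the file contents are read only once, by `tr`.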

score 8 · Answer 4 · edited Oct 17 '11 at 18:30

If all you need to do is count the number of lines containing your character, this will work:

grep -c 'f' myfile

However, it counts multiple occurrences of 'f' on the same line as a single match.
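A small sketch makes the distinction concrete. The sample contents and variable names are made up; the second pipeline is the `grep -o` approach from the accepted answer, shown only for comparison.

```shell
sample=$(mktemp)
printf 'fluffy fox\nno match here\nfife\n' > "$sample"

# Lines that contain at least one 'f'.
matching_lines=$(grep -c 'f' "$sample")

# Every individual 'f', one per output line.
occurrences=$(grep -o 'f' "$sample" | wc -l)

echo "lines: $matching_lines, occurrences: $occurrences"

rm -f "$sample"
```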

edited Oct 17 '11 at 18:30

Chris Betti

answered May 10 '10 at 23:43

Jongo the Gibbon

score 4 · Answer 5 · answered Oct 21 '09 at 21:19

tr -d '\n' < file | sed 's/A/A\n/g' | wc -l

Replace the two occurrences of "A" with your character, and "file" with your input file.

tr -d '\n' < file: removes newlines
sed 's/A/A\n/g': adds a newline after every occurrence of "A"
wc -l: counts the number of lines

Example:

$ cat file
abcdefgabcdefgababababbbba


1234gabca

$ tr -d '\n' < file | sed 's/a/a\n/g' | wc -l
9
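The example can be reproduced without creating the file by hand. The sketch below rebuilds the same contents with `printf` and also shows a `tr`-only variant of the same idea, which is an alternative added here rather than part of the answer. Treating `\n` in the `sed` replacement as a newline is GNU `sed` behaviour; other `sed` implementations may differ.

```shell
sample=$(mktemp)
# Same contents as the example file above, including the two blank lines.
printf 'abcdefgabcdefgababababbbba\n\n\n1234gabca\n' > "$sample"

# GNU sed: end a line after each 'a', then count the lines.
with_sed=$(tr -d '\n' < "$sample" | sed 's/a/a\n/g' | wc -l)

# Same idea with tr only: turn each 'a' into a newline and count those.
with_tr=$(tr -d '\n' < "$sample" | tr 'a' '\n' | wc -l)

echo "sed: $with_sed, tr: $with_tr"

rm -f "$sample"
```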
