Say I have a large file with many rows and many columns. I'd like to find out how many rows and columns I have using bash.
-
give an example of the input and expected output – Diego Torres Milano Apr 23 '11 at 00:12
-
Sorry, I'm not very familiar with bash. In R, it would look something like dim(input), which would return two numbers, #rows and #cols. – Nick Apr 23 '11 at 00:14
-
an actual input file might look like: "blah\tdata\tdata\tdata\tdata\nblah2\tdata\tdata\tdata\tdata\n" – Nick Apr 23 '11 at 00:15
-
I was hoping there might be an elegant way to do this with some built-in function...perhaps something like wc? – Nick Apr 23 '11 at 00:16
13 Answers
Columns: `awk '{print NF}' file | sort -nu | tail -n 1`

Use `head -n 1` for the lowest column count, `tail -n 1` for the highest.

Rows: `cat file | wc -l`, or `wc -l < file` for the UUOC crowd.
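If the file is large, the sort can be skipped entirely by tracking the maximum in a single awk pass (a sketch; the file name `/tmp/demo_dims.txt` is just for illustration, and whitespace-separated fields are assumed — add `-F'\t'` for tab-delimited data):

```shell
# Count rows and the maximum column count in one awk run,
# instead of piping every NF through sort and tail.
printf 'a b c\na b c d\n' > /tmp/demo_dims.txt

dims=$(awk '{ if (NF > max) max = NF } END { print NR, max }' /tmp/demo_dims.txt)
echo "$dims"   # row count, then max column count
```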

-
@Tim: Typo, should be `<` obviously. This isn't how I'd do it, but it satisfies the UUOC crowd (`cat` enhances readability IMO, and I prefer it over the less readable pipes *especially* when answering newbie questions) – Erik Apr 23 '11 at 00:35
-
@Erik You can also do "< file wc -l" and put the redirection before the command for enhanced readability. (Although, in this case I'm not sure why you don't just do "wc -l file") – William Pursell Apr 23 '11 at 03:33
-
No need for the sort or the tail, just do it all in awk: awk '{if( NF > max ) max = NF} END {print max}' – William Pursell Apr 23 '11 at 03:36
-
@WilliamPursell in case you're still wondering this 8 years down the track, when using a pipe or stdin `wc` won't show a filename while `wc -l file` does. – trs Aug 19 '19 at 17:36
-
@Joy: `awk -F'\t' '{print NF}' file | sort -nu | tail -n 1` to use tab as delimiter – Johan Zicola Dec 03 '19 at 13:33
-
This is very wasteful for long files. You can just pipe head or tail into awk like this: `head -n 1 FILE | awk '{print NF}'` – Cornelius Roemer Jul 23 '21 at 13:16
-
@CorneliusRoemer Your approach assumes all rows have the same column count. The question indicates rows have differing column counts. – Erik Jul 27 '21 at 12:12
-
I don't see where the question specifies that column counts vary by row. But sure, your solution is more general, but also slower. – Cornelius Roemer Jul 27 '21 at 20:28
Alternatively, to count columns, count the separators between them. I find this a good balance of brevity and ease of remembering. Of course, this won't work if your data includes the column separator itself.

head -n1 myfile.txt | grep -o " " | wc -l

`head -n1` grabs the first line of the file. `grep -o " "` finds every space and prints each one on its own line. `wc -l` counts those lines.

EDIT: As Gaurav Tuli points out below, I forgot to mention you have to mentally add 1 to the result, or otherwise script this math.
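"Scripting this math" is one shell arithmetic expansion; a sketch assuming a single-space delimiter with no leading or trailing spaces (the sample file name is made up):

```shell
# Separators + 1 = columns, computed rather than added mentally.
printf 'blah data data data data\n' > /tmp/demo_sep.txt

ncols=$(( $(head -n1 /tmp/demo_sep.txt | grep -o ' ' | wc -l) + 1 ))
echo "$ncols"
```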

-
Column count in a CSV will be `head -n1 myfile.txt | grep -o "," | wc -l` + 1 because grep is counting the number of commas (or any other column separator) but the number of columns would be 1 more than that. – Gaurav Tuli Jun 14 '21 at 15:43
If your file is big but you are certain that the number of columns remains the same for each row (and you have no heading), use:

head -n 1 FILE | awk '{print NF}'

to find the number of columns, where FILE is your file name.

To find the number of lines, `wc -l FILE` will work.
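A quick sanity check of both commands on a throwaway tab-separated file (the file name is illustrative; awk's default field separator already splits on tabs):

```shell
# Uniform column count, so NF of the first line is representative.
printf 'a\tb\tc\na\tb\tc\n' > /tmp/demo_uniform.tsv

ncols=$(head -n 1 /tmp/demo_uniform.tsv | awk '{print NF}')
nrows=$(wc -l < /tmp/demo_uniform.tsv)
echo "$ncols columns, $nrows rows"
```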

A little twist on kirill_igum's answer: you can easily count the number of columns of any particular row, which is why I came to this question, even though it asks about the whole file. (If your file has the same number of columns on every line, this still answers that too):

head -2 file |tail -1 |tr '\t' '\n' |wc -l

gives the number of columns of row 2. Replace 2 with, say, 55 to get it for row 55.
-bash-4.2$ cat file
1 2 3
1 2 3 4
1 2
1 2 3 4 5
-bash-4.2$ head -1 file |tail -1 |tr '\t' '\n' |wc -l
3
-bash-4.2$ head -4 file |tail -1 |tr '\t' '\n' |wc -l
5
The code above works if your file is tab-separated, since that is the character we give to tr. If your file uses another separator, say commas, you can still count your "columns" with the same trick by simply swapping the separator character '\t' for ',':
-bash-4.2$ cat csvfile
1,2,3,4
1,2
1,2,3,4,5
-bash-4.2$ head -2 csvfile |tail -1 |tr '\,' '\n' |wc -l
2
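The same per-row count can also be done in a single awk process (a sketch, not part of the original answer; the sample file name is made up):

```shell
# Print NF for row 2 only; 'exit' stops reading as soon as the row is
# found, which matters for big files.
printf '1\t2\t3\n1\t2\t3\t4\n1\t2\n' > /tmp/demo_rows.tsv

cols_row2=$(awk -F'\t' 'NR == 2 { print NF; exit }' /tmp/demo_rows.tsv)
echo "$cols_row2"
```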

For rows you can simply use `wc -l file`; `-l` stands for total lines.

For columns you can simply use `head -1 file | tr ";" "\n" | wc -l`

Explanation:

`head -1 file` grabs the first line of your file, which should be the headers, and sends it to the next command through the pipe.

`tr ";" "\n"`: tr stands for translate. It translates every `;` character into a newline character; in this example `;` is your delimiter. It then sends the data on to the next command.

`wc -l` counts the total number of lines.
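Trying that pipeline on a small semicolon-delimited sample (the file name is illustrative):

```shell
# Each ';' on the header line becomes a newline, so wc -l counts fields.
printf 'id;name;age\n1;bob;30\n' > /tmp/demo_semi.csv

ncols=$(head -1 /tmp/demo_semi.csv | tr ';' '\n' | wc -l)
nrows=$(wc -l < /tmp/demo_semi.csv)
echo "$ncols columns, $nrows rows"
```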

If counting the number of columns in the first line is enough, try the following:

awk -F'\t' '{print NF; exit}' myBigFile.tsv

where `\t` is the column delimiter.

awk 'BEGIN{FS=","}END{print "COLUMN NO: "NF " ROWS NO: "NR}' file
You can use any delimiter as the field separator; this prints both the number of rows and the number of columns.
-
How is it possible to put this in an ```alias```? I tried different ways but it gives errors, possibly related to the ```' "``` quoting. – Apex Jun 10 '22 at 15:14
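A demo of the one-liner (a sketch; note that NF in the END block reflects the last line read, so the reported column count is only meaningful when every row has the same width):

```shell
printf '1,2,3\n4,5,6\n' > /tmp/demo_end.csv

out=$(awk 'BEGIN{FS=","} END{print "COLUMN NO: " NF " ROWS NO: " NR}' /tmp/demo_end.csv)
echo "$out"
```

As for the alias question above: wrapping the command in a shell function instead of an alias usually sidesteps the nested-quoting errors.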
You can use pure bash. Note that for very large files (GBs), awk/wc will be faster, but performance should still be manageable for files of a few MB.

declare -i count=0
while read -r
do
((count++))
done < file
echo "line count: $count"
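One caveat with plain `read`: it returns non-zero on a final line that lacks a trailing newline, so that line gets skipped. A POSIX-leaning sketch that counts it too:

```shell
printf 'one\ntwo\nthree' > /tmp/demo_noeol.txt   # no newline after 'three'

count=0
# read fails on the last partial line but still fills $line,
# so the extra [ -n "$line" ] test keeps it in the count.
while read -r line || [ -n "$line" ]; do
    count=$((count + 1))
done < /tmp/demo_noeol.txt
echo "line count: $count"
```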

head -1 file.tsv |tr '\t' '\n' |wc -l

Take the first line, change the tabs to newlines (use ',' instead of '\t' for commas), and count the number of lines.

Simple row count is `$(wc -l "$file")`. Use `$(wc -lL "$file")` to show both the number of lines and the length of the longest line (note that `-L` is a GNU/BSD extension, not POSIX).
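For example (the file name is illustrative, and a GNU/BSD `wc` with `-L` is assumed):

```shell
printf 'short\na much longer line\n' > /tmp/demo_wc.txt

nlines=$(wc -l < /tmp/demo_wc.txt)   # number of lines
maxlen=$(wc -L < /tmp/demo_wc.txt)   # length of the longest line
echo "$nlines $maxlen"
```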

-
True. Silly of me to assume this was obvious: `wc -l file |cut -f 1`. – Tim Sylvester Apr 23 '11 at 00:26
-
@Tim Sylvester: You know UUOC is about the wasted process, right? I'm tempted to pass it back to you for that `cut` ;) – Erik Apr 23 '11 at 00:28
-
How is it wasted if you want the file name removed from the output? Is there a way I don't know of to make `wc` not print the filename? – Tim Sylvester Apr 23 '11 at 00:31
Perl solution:
perl -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
If your input file is comma-separated:
perl -F, -ane '$maxc = $#F if $#F > $maxc; END{$maxc++; print "max columns: $maxc\nrows: $.\n"}' file
Output:

max columns: 5
rows: 2

`-a` autosplits the input line into the `@F` array.
`$#F` is the number of columns minus 1.
`-F,` sets the field separator to `,` instead of whitespace.
`$.` is the line number (i.e. the number of rows).

A very simple way to count the columns of the first line in pure bash (no awk, perl, or other languages):

read -r line < "$input_file"
ncols=$(echo "$line" | wc -w)

This works as long as your fields are whitespace-separated and contain no embedded whitespace.
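A variant with a configurable delimiter, still pure bash (a sketch; the file name is made up). Reading into an array with a custom IFS avoids relying on word splitting:

```shell
printf 'a,b,c,d\n' > /tmp/demo_purebash.csv

# Split the first line on ',' into an array and count its elements.
IFS=, read -r -a fields < /tmp/demo_purebash.csv
ncols=${#fields[@]}
echo "$ncols"
```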

The following code will do the job and lets you specify the field delimiter. This is especially useful for files containing more than 20k lines.
awk 'BEGIN {
FS="|";
min=10000;
}
{
if( NF > max ) max = NF;
if( NF < min ) min = NF;
}
END {
print "Max=" max;
print "Min=" min;
} ' myPipeDelimitedFile.dat
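Running the same min/max scan on a small pipe-delimited sample; this sketch initializes min from the first row (`NR == 1`) instead of the 10000 magic constant, so it also works for rows wider than 10000 fields:

```shell
printf 'a|b|c\na|b\na|b|c|d\n' > /tmp/demo_pipe.dat

out=$(awk 'BEGIN { FS = "|" }
           NR == 1 { min = NF }
           { if (NF > max) max = NF
             if (NF < min) min = NF }
           END { print "Max=" max " Min=" min }' /tmp/demo_pipe.dat)
echo "$out"
```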
