277

Is there a bash command which counts the number of files that match a pattern?

For example, I want to get the count of all files in a directory which match this pattern: log*

codeforester
  • 39,467
  • 16
  • 112
  • 140
hudi
  • 15,555
  • 47
  • 142
  • 246
  • 2
    We keep getting new answers which don't work for nontrivial file names. If you want to post a new answer, please read http://mywiki.wooledge.org/ParsingLs and https://mywiki.wooledge.org/BashFAQ/020 first – tripleee Apr 23 '22 at 07:28

16 Answers

367

This simple one-liner should work in any shell, not just bash:

ls -1q log* | wc -l

ls -1q will give you one line per file, even if they contain whitespace or special characters such as newlines.

The output is piped to wc -l, which counts the number of lines.
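If nothing matches, ls prints an error on stderr, which wc does not count (as the comments below note), but the message is noisy. A minimal defensive variant, also adding -d so that directories matching log* are counted as single entries rather than having their contents listed:

ls -1qd log* 2>/dev/null | wc -l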

Daniel
  • 4,797
  • 2
  • 23
  • 30
  • you just have * on the opposite side but this works thx a lot – hudi Jul 03 '12 at 08:45
  • 11
    I would not use `-l`, since that requires `stat(2)` on each file and for the purposes of counting adds nothing. – camh Jul 03 '12 at 08:52
  • 15
    I would not use `ls`, since it creates a child process. `log*` is expanded by the shell, not `ls`, so a simple `echo` would do. – cdarke Jul 03 '12 at 09:24
  • 3
    Except an echo will not work if you have file names with spaces or special characters. – Daniel Oct 07 '14 at 06:38
  • I also get "ls: cannot access log*: No such file or directory" if no files match. wc does not count this error as a line since it is output via stderr rather than stdout. Also, the -l (for long) can be replaced by -1f (for files-only, one per line). Try "2>/dev/null ls -1f log* | wc -l" if cleanliness is important. – mightypile Apr 07 '15 at 20:01
  • 1
    Actually "ls -l" doesn't print files one per line when the output is piped. `touch abc$'\n'def; ls abc*def | wc -l` prints 2 even though if you don't pipe it, it'll show `abc?def`. `ls -1f` has the same problem. – mogsie Aug 22 '15 at 18:32
  • 1
    Looks like you're right @mogsie, funny, I'm sure I tested that back in 2012. Updated to defer to Mat's answer if files with newlines need to be catered for. – Daniel Dec 23 '15 at 06:32
  • @camh is right, `ls -l` is an overkill. `ls -1` is enough. – Walter Tross Jan 24 '17 at 21:21
  • 4
    @WalterTross That's true (not that efficiency was a requirement of the original question). I also just found that -q takes care of files with newlines, even when the output is not the terminal. And these flags are supported by all the platforms and shells I've tested on. Updating the answer, thanks to you and camh for the input! – Daniel Jan 25 '17 at 05:38
  • wc -l counts lines. I am sure there must be a better solution which does not read all the filenames. – neoexpert Mar 09 '18 at 17:04
  • 6
    If there's a directory called `logs` in the directory in question, then the _contents_ of that logs directory will be counted too. This is probably not intentional. – mogsie Aug 06 '18 at 11:55
  • If you are in an interactive bash shell, I'd recommend using `command ls -1q` to bypass any function or alias wrappers that might set undesirable switches (*lots* of default `.bashrc`/`.bash_profile` configurations wrap `ls` in an alias with default switches). Adding `-U` or `-f` would make it faster on large directories, but no more correct (as long as you used `command` to avoid aliases; `-f` might protect you from some aliases if you forget to use `command`). – ShadowRanger Aug 10 '19 at 01:38
  • This will work with any POSIX shell and does not parse the output of `ls`, which is a slippery slope bad practice: `set -- log*; echo $#` – Léa Gris Oct 21 '20 at 14:15
  • I am using bash (with `oh-my-bash` too) and for a directory containing 3 subdirectories `log1`, `log2`, and `log3`, the command `ls -1q log* | wc -l` is returning 5, not 3. This seems to be because `ls` is listing the files separated by blank lines. Any ideas how to correct this? – Aerinmund Fagelson Sep 29 '21 at 13:32
  • 2
    @AerinmundFagelson In your case you have directories matching the `log*` pattern (this is different to the OP's case), so their contents will also be included. You can add the `-d` argument to prevent `ls` from displaying directory contents, e.g.: `ls -1qd` – Daniel Sep 30 '21 at 10:30
  • Thank-you @Daniel adding the `-d` argument did the trick :) – Aerinmund Fagelson Oct 07 '21 at 12:26
  • 1
    I think one needs to redirect the stderr to /dev/null so that the result will be accurate when the directory is empty – humility Apr 22 '22 at 18:48
  • This answer does not work if there is no `log*` file. adding `2>/dev/null` and using `wc -c` instead would be better – Lunartist Jul 25 '22 at 02:25
  • For large directories this approach may never come back with a response: `ls` internally does implicit sorting by file name, with all the disadvantages that this can bring. I think that this cannot be the accepted answer, as it can potentially hang waiting for the OS to return and sort the whole list before counting a very long list of names, looking for '\n's. See [@Will-Vousden's answer](https://stackoverflow.com/a/11307341/5152432), which works much faster and with far less memory – Marco Carlo Moriggi Jul 21 '23 at 12:17
85

Lots of answers here, but some don't take into account

  • file names with spaces, newlines, or control characters in them
  • file names that start with hyphens (imagine a file called -l)
  • hidden files, which start with a dot (if the glob was *.log instead of log*)
  • directories that match the glob (e.g. a directory called logs that matches log*)
  • empty directories (i.e. cases where the result should be 0)
  • extremely large directories (listing them all could exhaust memory)

Here's a solution that handles all of them:

ls 2>/dev/null -Ubad1 -- log* | wc -l

Explanation:

  • -U causes ls to not sort the entries, meaning it doesn't need to load the entire directory listing in memory
  • -b prints C-style escapes for nongraphic characters, crucially causing newlines to be printed as \n.
  • -a prints out all files, even hidden ones (not strictly needed here, since the glob log* doesn't match hidden files anyway)
  • -d prints out directories without attempting to list the contents of the directory, which is what ls normally would do
  • -1 makes sure that it's on one column (ls does this automatically when writing to a pipe, so it's not strictly necessary)
  • 2>/dev/null redirects stderr, so that if there are 0 log files the error message is ignored. (Note that with shopt -s nullglob set, an unmatched glob would expand to nothing and ls would list the entire working directory instead.)
  • wc -l consumes the directory listing as it's being generated, so the output of ls is never in memory at any point in time.
  • -- separates the file names from the options, so that a matching name beginning with a hyphen (such as -l) is not understood as an argument to ls (this matters if the log* restriction is removed)

The shell will expand log* to the full list of files, which may exhaust memory if there are a lot of files, so running it through grep instead is better:

ls -Uba1 | grep ^log | wc -l

This last one handles extremely large directories of files without using a lot of memory (though it does add another process to the pipeline). The -d is no longer necessary, because ls is now only listing the contents of the current directory.
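As hidefromkgb notes in a comment below, grep can also do the counting itself, which drops the extra wc process; an equivalent sketch:

ls -Uba1 | grep -c ^log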

mogsie
  • 4,021
  • 26
  • 26
  • 9
    I'm almost 5 years late but I'd still like to point out that `grep` can count lines as well, rendering `wc -l` unnecessary. The resulting command would look like this: `ls -Uba1 | grep -c ^log`. Nevertheless, the original answer is extremely helpful. – hidefromkgb Jul 30 '20 at 01:13
72

For a recursive search:

find . -type f -name '*.log' -printf x | wc -c

wc -c will count the number of bytes in the output of find, while -printf x tells find to print a single x for each result. This avoids any problems with files with odd names which contain newlines etc.

For a non-recursive search, do this:

find . -maxdepth 1 -type f -name '*.log' -printf x | wc -c
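-printf is a GNU extension; where it is unavailable (e.g. BSD/macOS find), a portable sketch gets the same one-byte-per-file effect from plain printf, whose format string is reused once per argument (%.0s consumes each filename while printing nothing):

find . -maxdepth 1 -type f -name '*.log' -exec printf '%.0sx' {} + | wc -c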
tripleee
  • 175,061
  • 34
  • 275
  • 318
Will Vousden
  • 32,488
  • 9
  • 84
  • 95
  • FYI if you simply leave out `-name '*.log'` then it will count all files, which is what I needed for my use case. Also the -maxdepth flag is extremely useful, thanks! – starmandeluxe Aug 31 '18 at 05:27
  • 1
    This does great in avoiding problems with any type of special characters! – bballdave025 May 17 '22 at 00:27
  • I benchmarked this one against the other approaches, and _(on my machine)_ **this one is the fastest** by a significant margin—the runner-up was the less-safe default `find` output piped into `wc -l`, and that was ~2x slower than your solution. So Kudos! `^v^` – Pyr3z Mar 23 '23 at 13:53
  • probably the most correct flag to use with `wc` is `-m`, which counts characters instead of bytes, but yes, I think that your one is the most effective solution – Marco Carlo Moriggi Jul 21 '23 at 12:03
69

You can do this safely (i.e. won't be bugged by files with spaces or \n in their name) with bash:

$ shopt -s nullglob
$ logfiles=(*.log)
$ echo ${#logfiles[@]}

You need to enable nullglob so that you don't get the literal *.log in the $logfiles array if no files match. (See How to "undo" a 'set -x'? for examples of how to safely reset it.)
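If you'd rather not leave nullglob enabled for the rest of your session, here is a sketch of the save-and-restore idiom (shopt -p prints the command that recreates the current setting; the variable name is illustrative):

saved=$(shopt -p nullglob)   # prints e.g. "shopt -u nullglob"
shopt -s nullglob
logfiles=(*.log)
echo ${#logfiles[@]}
eval "$saved"                # restore the original nullglob state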

Mat
  • 202,337
  • 40
  • 393
  • 406
  • 4
    Perhaps explicitly point out that this is a Bash-*only* answer, especially for new visitors who are not yet entirely up to speed on the [Difference between sh and bash](/questions/5725296/difference-between-sh-and-bash) – tripleee Sep 01 '18 at 09:04
  • Also, the final `shopt -u nullglob` should be skipped if `nullglob` wasn't unset when you started. – tripleee Sep 01 '18 at 09:05
  • Note: Replacing `*.log` with just `*` will count directories. If the files you wish to enumerate have the traditional naming convention of `name.extension`, use `*.*`. – AlainD Dec 06 '19 at 14:13
  • 1
    An explanation of what the [@] means would be helpful – Andy Preston Oct 14 '21 at 10:36
11

The accepted answer for this question is wrong, but I have low rep so can't add a comment to it.

The correct answer to this question is given by Mat:

shopt -s nullglob
logfiles=(*.log)
echo ${#logfiles[@]}

The problem with the accepted answer is that wc -l counts the number of newline characters, including newlines embedded in file names, even though they print to the terminal as '?' in the output of 'ls -l'. This means that the accepted answer FAILS when a filename contains a newline character. I have tested the suggested command:

ls -l log* | wc -l

and it erroneously reports a value of 2 even if there is only 1 file matching the pattern whose name happens to contain a newline character. For example:

touch log$'\n'def
ls log* -l | wc -l
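# with the single file created above, this prints 2, not 1: the
# embedded newline in the name becomes an extra line for wc -l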
Dan Yard
  • 147
  • 1
  • 5
10

An important comment

(not enough reputation to comment)

This is BUGGY:

ls -1q some_pattern | wc -l

If shopt -s nullglob happens to be set and the pattern matches nothing, the glob expands to nothing and ls lists ALL entries in the working directory, not just the ones matching the pattern (tested on CentOS-8 and Cygwin). Who knows what other meaningless bugs ls has?

This is CORRECT and much faster:

shopt -s nullglob; files=(some_pattern); echo ${#files[@]};

It does the expected job.


And the running times differ: the first takes 0.006 on CentOS and 0.083 on Cygwin (when used with care); the second takes 0.000 on CentOS and 0.003 on Cygwin.
Small Boy
  • 147
  • 1
  • 8
8

If you have a lot of files and you don't want to use the elegant shopt -s nullglob and bash array solution, you can use find and so on as long as you don't print out the file name (which might contain newlines).

find -maxdepth 1 -name "log*" -not -name ".*" -printf '%i\n' | wc -l

This will find all files that match log* and that don't start with a dot. The -not -name ".*" is redundant here, but it's important to note that the default for ls is to not show dot-files, whereas the default for find is to include them.

This is a correct answer, and handles any type of file name you can throw at it, because the file name is never passed around between commands.

But, the shopt nullglob answer is the best answer!

mogsie
  • 4,021
  • 26
  • 26
  • You probably should update your original answer instead of answering again. – qodeninja Aug 01 '17 at 19:39
  • I think using `find` vs using `ls` is a choice between two different ways of solving the problem. `find` is not always present on a machine, but `ls` usually is. – mogsie Aug 18 '17 at 09:32
  • 3
    But then a box of lard which doesn't have `find` probably doesn't have all those fancy options for `ls` either. – tripleee Sep 01 '18 at 09:12
  • 1
    Note also how this extends to a whole directory tree if you take out the `-maxdepth 1` – tripleee Oct 29 '18 at 05:33
  • 2
    Note this solution will count files inside hidden directories in its count.`find` does this by default. This can create confusion if one doesn't realize there's a hidden child folder, and may make it advantageous to use `ls` in some circumstances, which does not report hidden files by default. – MrPotatoHead Feb 12 '19 at 14:12
7

Here is my one liner for this.

file_count=$( shopt -s nullglob ; set -- "$directory_to_search_inside"/* ; echo $# )
z atef
  • 7,138
  • 3
  • 55
  • 50
  • It took me some googling to understand, but this is nice! So `set -- ` is not doing anything except getting us ready for `$#`, that _stores the number of command-line arguments that were passed to the shell program_ – xverges Nov 19 '19 at 08:47
  • @xverges Yes. "shopt -s nullglob" makes the glob expand to nothing (rather than staying the literal pattern) when there are no matches. set -- is for storing/setting the positional parameters (the matched files, in this case), and $# displays the number of positional parameters (the file count). – z atef Nov 22 '19 at 23:11
  • A correct syntax and POSIX compliant version of your implementation, without even a sub-shell spawned: `set -- "$directory_to_search_inside/"*; [ $# -eq 1 -a ! -e "$1" ] && shift; file_count=$#` – Léa Gris Oct 21 '20 at 15:34
4

You can use the -R option to find the files along with those inside the recursive directories

ls -R | wc -l               # count all the files

ls -R | grep log | wc -l    # count the files whose names contain "log"

You can use any regular expression pattern with grep.
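A sketch that anchors the match, counting only names that begin with log; grep -c counts the matching lines itself (the same ls-parsing caveats raised in the comment below apply):

ls -R | grep -c '^log'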

Moh .S
  • 1,920
  • 19
  • 19
  • 1
    It's not recommended to do it this way. Explanation: github.com/koalaman/shellcheck/wiki/SC2010 – mana Nov 08 '21 at 11:02
4

You can define such a command easily, using a shell function. This method does not require any external program and does not spawn any child process. It does not attempt hazardous ls parsing and handles “special” characters (whitespaces, newlines, backslashes and so on) just fine. It only relies on the file name expansion mechanism provided by the shell. It is compatible with at least sh, bash and zsh.

The line below defines a function called count which prints the number of arguments with which it has been called.

count() { echo $#; }

Simply call it with the desired pattern:

count log*

For the result to be correct when the globbing pattern has no match, the shell option nullglob (or failglob — which is the default behavior on zsh) must be set at the time expansion happens. It can be set like this:

shopt -s nullglob    # for sh / bash
setopt nullglob      # for zsh

Depending on what you want to count, you might also be interested in the shell option dotglob.
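For instance, a quick illustration in bash (the hidden file name is hypothetical):

shopt -s dotglob
count *        # now also counts hidden entries such as .log.1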

Unfortunately, with bash at least, it is not easy to set these options locally. If you don’t want to set them globally, the most straightforward solution is to use the function in this more convoluted manner:

( shopt -s nullglob ; shopt -u failglob ; count log* )

If you want to recover the lightweight syntax count log*, or if you really want to avoid spawning a subshell, you may hack something along the lines of:

# sh / bash:
# the alias is expanded before the globbing pattern, so we
# can set required options before the globbing gets expanded,
# and restore them afterwards.
count() {
    eval "$_count_saved_shopts"
    unset _count_saved_shopts
    echo $#
}
alias count='
    _count_saved_shopts="$(shopt -p nullglob failglob)"
    shopt -s nullglob
    shopt -u failglob
    count'

As a bonus, this function is of a more general use. For instance:

count a* b*          # count files which match either a* or b*
count $(jobs -ps)    # count stopped jobs (sh / bash)

By turning the function into a script file (or an equivalent C program), callable from the PATH, it can also be composed with programs such as find and xargs:

find "$FIND_OPTIONS" -exec count {} \+    # count results of a search
Maëlan
  • 3,586
  • 1
  • 15
  • 35
2

I've given this answer a lot of thought, especially given the don't-parse-ls stuff. At first, I tried

<WARNING! DID NOT WORK>
du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | awk '{sum+=int($1)}END{print sum}'
</WARNING! DID NOT WORK>

which worked if there was only a filename like

touch $'w\nlf.aa'

but failed if I made a filename like this

touch $'firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg'

I finally came up with what I'm putting below. Note I was trying to get a count of all files in the directory (not including any subdirectories). I think it works, like the answers by @Mat and @Dan_Yard, and meets at least most of the requirements set out by @mogsie (I'm not sure about memory). I think the answer by @mogsie is correct, but I always try to stay away from parsing ls unless it's an extremely specific situation.

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

More readably:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'

This is doing a find specifically for files, delimiting the output with a null character (to avoid problems with spaces and linefeeds), then counting the number of null characters. Since -print0 terminates every name with a null, the number of files equals the number of nulls; awk's NF-1 yields exactly that count, because there is always one more field than there are separators.

To answer the OP's question, there are two cases to consider

1) Non-recursive search:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

2) Recursive search. Note that what's inside the -name parameter might need to be changed for slightly different behavior (hidden files, etc.).

awk -F"\0" '{print NF-1}' < \
  <(find . -type f -name "log*" -print0) | \
    awk '{sum+=$1}END{print sum}'

If anyone would like to comment on how these answers compare to those I've mentioned in this answer, please do.


Note, I got to this thought process while getting this answer.

bballdave025
  • 1,347
  • 1
  • 15
  • 28
  • 2
    I don't think Awk can reliably be expected to cope correctly with null bytes either, though. – tripleee Apr 23 '22 at 07:22
  • I think I understand what you are trying to say, @tripleee, but I want to be sure. Are you saying that a filename containing a null character will cause problems when passed to `awk -F'\0'`? That would absolutely be the case, and I'm not sure how to deal with it. – bballdave025 May 17 '22 at 00:17
  • However, with more reflection, a filename can't have a null character, cf. https://stackoverflow.com/questions/54205087/how-can-i-create-a-file-with-null-bytes-in-the-filename . Could you further explain what you mean, @tripleee ? If this answer doesn't work for certain nontrivial filenames, I don't want to keep it here. – bballdave025 May 17 '22 at 00:42
  • The `-print0` specifically adds a null byte between file names as the only separator which absolutely cannot be part of the filename itself. Some Awk versions can deal with that, but others can't. For more background, perhaps see also https://mywiki.wooledge.org/BashFAQ/020 – tripleee May 17 '22 at 05:25
1

This can be done with standard POSIX shell grammar.

Here is a simple count_entries function:

#!/usr/bin/env sh

count_entries()
{
  # Emulating Bash nullglob:
  # if argument 1 does not name an existing entry, the glob did not
  # match and $1 is the literal unexpanded pattern, so shift it out
  if [ ! -e "$1" ]
    then shift
  fi
  echo $#
}

For a compact definition:

count_entries(){ [ ! -e "$1" ]&&shift;echo $#;}
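A quick usage example (the pattern is hypothetical):

count_entries log*    # prints the number of matches, or 0 when nothing matches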

A full-featured, POSIX-compatible file counter by type:

#!/usr/bin/env sh

count_files()
# Count the file arguments matching the file operator
# Synopsis:
# count_files operator FILE [...]
# Arguments:
# $1: The file operator
#   Allowed values:
#   -a FILE    True if file exists.
#   -b FILE    True if file is block special.
#   -c FILE    True if file is character special.
#   -d FILE    True if file is a directory.
#   -e FILE    True if file exists.
#   -f FILE    True if file exists and is a regular file.
#   -g FILE    True if file is set-group-id.
#   -h FILE    True if file is a symbolic link.
#   -L FILE    True if file is a symbolic link.
#   -k FILE    True if file has its `sticky' bit set.
#   -p FILE    True if file is a named pipe.
#   -r FILE    True if file is readable by you.
#   -s FILE    True if file exists and is not empty.
#   -S FILE    True if file is a socket.
#   -t FD      True if FD is opened on a terminal.
#   -u FILE    True if the file is set-user-id.
#   -w FILE    True if the file is writable by you.
#   -x FILE    True if the file is executable by you.
#   -O FILE    True if the file is effectively owned by you.
#   -G FILE    True if the file is effectively owned by your group.
#   -N FILE    True if the file has been modified since it was last read.
# $@: The files arguments
# Output:
#   The number of matching files
# Return:
#   1: Unknown file operator
{
  operator=$1
  shift
  case $operator in
    -[abcdefghLkprsStuwxOGN])
      for arg; do
        # If file is not of required type
        if ! test "$operator" "$arg"; then
          # Shift it out
          shift
        fi
      done
      echo $#
      ;;
    *)
      printf 'Invalid file operator: %s\n' "$operator" >&2
      return 1
      ;;
  esac
}

count_files "$@"

Example usages:

count_files -f log*.txt
count_files -d datadir*

An alternative that counts non-directory entries without a loop:

#!/bin/sh

# Creates strings of as many dots as expanded arguments

# dotted string for entries matching star pattern
star=$(printf '%.0s.' ./*)
# dotted string for entries matching star slash pattern (directories)
star_dir=$(printf '%.0s.' ./*/)
# dotted string for entries matching dot star pattern
dot_star=$(printf '%.0s.' ./.*)
# dotted string for entries matching dot star slash pattern (directories)
dot_star_dir=$(printf '%.0s.' ./.*/)

# Print pattern matches count excluding directories matches
printf 'Files count: %d\n' $((
  ${#star} - ${#star_dir} +
  ${#dot_star} - ${#dot_star_dir}
))
Léa Gris
  • 17,497
  • 4
  • 32
  • 41
0

Here is a generic Bash function you can use in your scripts.

    # @see https://stackoverflow.com/a/11307382/430062
    function countFiles {
        shopt -s nullglob
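        # $1 is deliberately left unquoted so that the glob it holds expands here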
        logfiles=($1)
        echo ${#logfiles[@]}
    }

    FILES_COUNT=$(countFiles "$file-*")
Theodore R. Smith
  • 21,848
  • 12
  • 65
  • 91
  • 1
    See: [`count_entries(){ [ $# -eq 1 ]&&[ ! -e "$1" ]&&shift;echo $#;} `](https://stackoverflow.com/a/64466042/7939871). It does not need `nullglob`, so it is 100% POSIX grammar. – Léa Gris Oct 21 '20 at 15:19
-1
ls -1 log* | wc -l

This lists one file per line, then pipes the output to the word-count command, whose -l switch makes it count lines.

nudzo
  • 17,166
  • 2
  • 19
  • 19
  • "-1" option is not necessary when piping the ls output. But you might want to hide ls error message if no file matches the pattern. I suggest " ls log* 2>/dev/null | wc -l ". – JohnMudd Jan 16 '14 at 14:37
  • The discussion under [Daniel's answer](/a/11307364/874188) is relevant here too. This works fine when you don't have matching directories or file names with newlines, but a good answer should at least point out these boundary conditions, and a great answer should not have them. Many bugs are because somebody copy/pasted code they didn't understand; so pointing out the flaws at least helps them understand what to watch out for. (Granted, many more bugs happen because they ignored the caveats and then things changed after they thought the code was probably good enough for their purpose.) – tripleee Sep 01 '18 at 09:07
-1

Here's what I always do:

ls log* | awk 'END{print NR}'

  • `awk 'END{print NR}'` should be equivalent to `wc -l`. – musiphil Feb 28 '20 at 18:50
  • We keep getting new answers which don't work for nontrivial file names. See http://mywiki.wooledge.org/ParsingLs and https://mywiki.wooledge.org/BashFAQ/020 – tripleee Apr 23 '22 at 07:26
-3

To count everything, just pipe ls to wc -l:

ls | wc -l

To count with a pattern, pipe through grep first:

ls | grep log | wc -l
jturi
  • 1,615
  • 15
  • 11
  • 1
    It's not recommended to do it this way. Explanation: https://github.com/koalaman/shellcheck/wiki/SC2010 – mana Nov 08 '21 at 11:00