There are at least two ways of doing the same thing: `ls -l *ABC*` and `ls -l | grep ABC`.

But which one is more efficient? Are there other, more efficient ways?

Cross2004
  • Of course `ls -l *ABC*` – anubhava Nov 02 '17 at 15:23
  • In almost all cases a single process will be far more efficient than piping. – 123 Nov 02 '17 at 15:25
  • What's the use case? If you want to pass a list of files to a different program, or iterate over them in your shell script, you shouldn't use `ls` at all -- not just for performance reasons, but for *correctness* reasons as well. See [Why you shouldn't parse the output of `ls`](http://mywiki.wooledge.org/ParsingLs) – Charles Duffy Nov 02 '17 at 15:35
  • @CharlesDuffy sometimes people just want to list files... – 123 Nov 02 '17 at 15:42
  • @123, sure, but it's useful to distinguish whether that's the situation here. And frankly, if someone "just want[s] to list files", it's surprising that performance matters (unless it's a really huge directory, in which case the biggest gains available come by way of telling `ls` not to sort). – Charles Duffy Nov 02 '17 at 15:43
  • In almost all cases you should not care. Both are fast enough. Of course `ls -l *ABC*` is faster and better. BTW, you forgot to take into account the shell (doing the globbing). Does it count in your "efficiency" concern? The only practically meaningful case would be a huge directory (with millions of entries), which is uncommon. – Basile Starynkevitch Nov 02 '17 at 16:30

2 Answers

They do two subtly different things.

`ls -l *ABC*` lists all entries in the current directory whose names contain `ABC` (and don't start with `.`).

`ls -l | grep ABC` lists all entries in the current directory whose names don't start with `.` and then filters out all lines that don't contain `ABC`.

`ls -l` lists user and group names as well as file names, so if there happen to be files owned by a user or group whose name contains `ABC`, they'll be listed regardless of their names. Most group and user names don't contain uppercase letters, but that's not a firm requirement, and you may well want to do the same thing with other patterns like `abc`. If the pattern happens to be something contained in the word `total`, you'll also match the first line of the output of `ls -l`.
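The `total` case is easy to reproduce. The following is a minimal sketch, assuming a POSIX shell and an `ls` that prints a `total N` header under the C locale; the scratch directory and file names are made up for illustration.

```shell
# Hypothetical scratch directory and file names, used only for this demonstration.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch report_total.txt notes.txt

# Glob: the shell matches file names only, so a single entry is listed.
glob_count=$(LC_ALL=C ls -l *total* | wc -l | tr -d ' ')

# Pipe: grep matches whole output lines, so the "total N" header matches as well.
grep_count=$(LC_ALL=C ls -l | grep -c total)

echo "glob: $glob_count line(s), grep: $grep_count line(s)"
```

The glob version should report one line and the pipe version two, because `grep` cannot tell the header line apart from a file entry.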

More obscurely, file names can legally contain any characters other than / and the null character -- including newlines. Using such a name is a really bad idea, but such a file's name will be listed across two or more lines, and grep operates on lines.

The output of `ls -l` is intended to be human-readable. It's not really intended to be processed automatically.

`ls -l *ABC*` says what you mean more clearly and directly. Think about that before you consider performance. Unless your current directory is positively huge, any performance difference is likely to be swamped by the time it takes to print the output.

Having said all that, let's look at the likely performance issues.

In `ls -l *ABC*`, the `*ABC*` wildcard is handled by the shell; `ls` sees only a list of arguments. The shell has to scan the current directory and build a sorted list of file names matching the pattern. The `ls` command will then sort that list again (and depending on your shell and locale settings, I'm not certain both sorts will yield the same order). Sorting might be a performance issue for very large directories. (Solution: Avoid making very large directories.) `ls` will be sorting fewer items than the shell does -- unless everything in the current directory matches `*ABC*`.
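To see that the expansion happens in the shell before `ls` is involved at all, the pattern can be expanded into the positional parameters and inspected. This is a small sketch assuming a POSIX shell with default globbing options; the directory and file names are hypothetical.

```shell
# Hypothetical scratch directory and file names.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch xABCx ABC.txt other.txt .ABC_hidden

# The shell expands the pattern here; no external command has run yet.
set -- *ABC*
matched=$#

echo "the shell would pass $matched argument(s) to ls:"
printf '  %s\n' "$@"
```

With default settings the hidden file is skipped, so only `ABC.txt` and `xABCx` are passed along as arguments.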

In `ls -l | grep ABC`, the `ls` command has to scan the current directory, sort it all, fetch metadata for everything, and then print it all, to have (probably) most of it filtered out by the separate `grep` process.

I don't know which is going to be faster. It likely depends on the contents of your current directory. But unless you're either working with huge directories or performing this operation many many times, the performance difference probably doesn't matter. If it does matter, measure it; that's the only way to know the difference in your environment.
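One way to measure it is sketched below. It assumes bash (for the `time` keyword) and the `seq` utility, and it builds a throwaway directory whose size and file names are arbitrary choices made for illustration; the numbers it prints will vary by system and file system cache state.

```shell
# Hypothetical benchmark directory: 2000 matching and 2000 non-matching names.
bench=$(mktemp -d)
( cd "$bench" && for i in $(seq 1 2000); do : > "file_$i"; : > "ABC_$i"; done )

cd "$bench" || exit 1

# Output goes to /dev/null so that terminal rendering does not dominate the timing.
time ls -l *ABC* > /dev/null
time ls -l | grep ABC > /dev/null
```

Running each command several times, and on a directory that resembles the real one, gives a more trustworthy comparison than a single run.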

Keith Thompson

Glob vs Grep.

The way `ls -l *ABC*` works is that the shell expands the wildcard pattern (a glob, not a regex) into a list of the file names that match it. Once that is done, `ls` simply lists those files in long (`-l`) format.

`ls -l | grep ABC` uses a pipe, which connects the standard output of the left command to the standard input of the right command. Thus `ls -l` first generates a listing of all the files in long format, and the pipe passes that listing to `grep`, which keeps only the lines matching its regex. Now, suppose the listing covers a million files: it is unnecessary to pass the whole thing through the pipe when the glob could select the matching names for you.

Thus `ls -l | grep ABC` would be much slower than `ls -l *ABC*`.

segfault
  • It's important to distinguish that in the `*ABC*` case, the list of filenames is generated by the shell **before `ls` is even started**. This implies some limits not present in the `ls | grep` case -- the local platform's maximum command-line length is pertinent. – Charles Duffy Nov 02 '17 at 15:36
  • (Contrast with `printf '%s\0' *ABC* | xargs -0 ls -l`, which avoids that limit by splitting into multiple `ls` invocations if there are more names than just one can handle). – Charles Duffy Nov 02 '17 at 15:44
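
The limit mentioned in these comments can be inspected with `getconf ARG_MAX`, and the `xargs` variant can be tried on a small directory. This is only a sketch: it assumes a shell where `printf` is a builtin (so the expanded names never go through `exec`) and an `xargs` that supports the non-POSIX `-0` option; the directory and file names are hypothetical.

```shell
# Hypothetical scratch directory; one name contains a space on purpose.
demo=$(mktemp -d) && cd "$demo" || exit 1
touch 'one ABC.txt' twoABC unrelated

# Upper bound, in bytes, on the arguments plus environment passed to a new process.
getconf ARG_MAX

# NUL-delimited names keep unusual file names intact; xargs splits them into
# as many ls invocations as needed to stay under the limit.
listed=$(printf '%s\0' *ABC* | xargs -0 ls -l | wc -l | tr -d ' ')
echo "$listed entries listed"
```

With only two matching names a single `ls` invocation is enough; the batching only matters once the expanded list approaches the platform limit.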