What is the fastest way to test if a directory is empty?

Of course I can check the length of

list.files(path, all.files = TRUE, include.dirs = TRUE, no.. = TRUE)

but this requires enumerating the entire contents of the directory, which I'd rather avoid.

EDIT: I'm looking for portable solutions.

EDIT^2: Some timings for a huge directory (run this in a directory that is initially empty; it will create 100,000 empty files):

system.time(file.create(as.character(0:99999)))
#    user  system elapsed 
#   0.720  12.223  14.948 
system.time(length(dir()))
#    user  system elapsed 
#   2.419   0.600   3.167 
system.time(system("ls | head -n 1"))
# 0
#   user  system elapsed 
#  0.788   0.495   1.312 
system.time(system("ls -f | head -n 3"))
# .
# ..
# 99064
#    user  system elapsed 
#   0.002   0.015   0.019 

The `-f` switch is crucial for `ls`: it avoids the sorting that would otherwise take place.
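A minimal sketch of how the `ls -f | head` idea could be wrapped in R, assuming a Unix-like system with `ls` and `head` on the `PATH` and a readable directory. The function name `is_dir_empty_fast` is hypothetical; on other platforms it falls back to the `list.files()` call from the question, which still enumerates everything.

```r
# Hypothetical helper: TRUE if `path` holds no entries besides "." and "..".
is_dir_empty_fast <- function(path = ".") {
  stopifnot(dir.exists(path))
  if (.Platform$OS.type != "unix") {
    # Portable fallback: enumerates the whole directory.
    entries <- list.files(path, all.files = TRUE, include.dirs = TRUE,
                          no.. = TRUE)
    return(length(entries) == 0L)
  }
  # `ls -f` lists entries unsorted; `head -n 3` stops after at most three
  # lines, i.e. "." and ".." plus one real entry if there is any.
  cmd <- paste("ls -f --", shQuote(path), "| head -n 3")
  entries <- system(cmd, intern = TRUE)
  # Anything other than "." and ".." means the directory is not empty.
  length(setdiff(entries, c(".", ".."))) == 0L
}
```

The entries are filtered by name rather than by position, because `ls -f` returns them in directory order and `.` and `..` are not guaranteed to come first.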

MichaelChirico
krlmlr
  • I just removed my comments (on Linux-only solutions), but you're right, portability is a nice feature to have. – dickoa Feb 05 '14 at 12:28
  • This question http://stackoverflow.com/questions/18685576/php-what-is-the-best-and-easiest-way-to-check-if-directory-is-empty-or-not?rq=1 has a comment recommending `rmdir` which "should" fail if the directory is non-empty. Check your permission level! – Carl Witthoft Feb 05 '14 at 15:28
  • Yeah, but... is it faster for empty or "not-huge" directories, and which are you more likely to run across? :-( – Carl Witthoft Feb 05 '14 at 15:52
  • @CarlWitthoft: Yes, and bubble sort is faster than quicksort for "not-huge" data... I've wondered, I'm curious, I've posted a question. Let's wait and see if there is a good answer. – krlmlr Feb 05 '14 at 15:56

1 Answer

How about `if (length(dir(all.files = TRUE)) == 0)`?

I'm not sure what you qualify as "fast," but if dir takes a long time, someone is abusing your filesystem :-(.
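As the comments point out, `dir(all.files = TRUE)` also reports `.` and `..`, so its length does not drop to zero for an empty directory. A minimal sketch of the same check with those two entries excluded via `no.. = TRUE`; the wrapper name `is_dir_empty` is hypothetical, and it still enumerates the whole directory:

```r
# Hypothetical helper: same idea as the one-liner, minus "." and "..".
is_dir_empty <- function(path = ".") {
  entries <- dir(path, all.files = TRUE, no.. = TRUE)
  length(entries) == 0L
}

# Usage on a scratch directory.
d <- tempfile("empty-check-")
dir.create(d)
is_dir_empty(d)                        # TRUE
file.create(file.path(d, ".hidden"))   # hidden files count as content
is_dir_empty(d)                        # FALSE
unlink(d, recursive = TRUE)
```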

Carl Witthoft
  • The OP mentioned that he wants to avoid enumerating the entire contents of the directory (and `dir` == `list.files`), and without `all.files = TRUE` you will miss hidden files. – dickoa Feb 05 '14 at 12:39
  • @dickoa good point, sorry. But I fail to see what's so bad about enumerating (since I'm not returning an object to the environment) – Carl Witthoft Feb 05 '14 at 12:45
  • Not that bad actually, but maybe the OP is just looking for a more efficient alternative to this solution. I use Linux, and for this kind of task, if I want to do it fast, I use the shell through `system`. – dickoa Feb 05 '14 at 12:56
  • @CarlWitthoft: I'm really nitpicking here. Enumerating is perhaps not noticeable performance-wise, except if a slow file system is used and the directory contains many entries. (You call it "abuse", but it might happen.) – krlmlr Feb 05 '14 at 13:21
  • `dir` is much faster than calling the operating system yourself. Compare `system.time(dir())` with `system.time(system("ls", TRUE))`. – Richie Cotton Feb 05 '14 at 14:29
  • But you still get `.` and `..` when you get all files. – James Feb 05 '14 at 15:06
  • @James --- Oh, well, guess it's still operating-system specific. – Carl Witthoft Feb 05 '14 at 15:27
  • @RichieCotton: I have edited my question to add a comparison for a huge directory that shows that `system("ls | head -n 1")` can be faster than `dir()`. – krlmlr Feb 05 '14 at 15:49
  • `head` doesn't exist in the Windows command shell. I tried the 100k files tests under Win7 with a local SSD, and I still find `dir` to be a little quicker. On a network drive, most of the time involves passing data across a network and it doesn't matter which solution you use. So as with many performance questions, there are no guaranteed wins; it depends on your setup. Interesting question though. – Richie Cotton Feb 05 '14 at 16:49
  • @RichieCotton: Sorry, it has to be `ls -f` to avoid sorting. – krlmlr Feb 05 '14 at 17:10
  • S/He wants to be portable, and this does not work under Windows unless you add `no.. = FALSE` – antonio May 20 '14 at 18:10