40

Is there a maximum number of inodes in a single directory?

I have a directory of over 2 million files and can't get the ls command to work against that directory. So now I'm wondering if I've exceeded a limit on inodes in Linux. Is there a limit lower than the 2^64 numerical limit?

Karel
Mark Witczak
  • You mean a maximum number of entries in a single directory, right? After all, you could make 2 million hardlinks to the same file in one directory, and that would cause the same problem. – Charles Duffy Sep 17 '08 at 04:08

10 Answers

58

df -i should tell you the number of inodes used and free on the file system.
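Pointing df -i at a path reports inode usage for the filesystem containing that path; the path below is just a placeholder for the directory in question:

df -i /path/to/directory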

tonylo
  • This is not the question. The number of entries in, for example, ext3/ext4 file systems has a fixed limit, but I forget what it was; something in the millions, I think 16 million or so. So yes, it is possible to run into this limit. – Lothar Jun 07 '21 at 14:13
19

Try ls -U or ls -f.

ls, by default, sorts the files alphabetically. If you have 2 million files, that sort can take a long time. If you use ls -U (or perhaps ls -f), the file names will be printed immediately, without sorting.
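For example (the path is a placeholder; head just limits the output for a quick look):

ls -U /path/to/big-directory | head -20
ls -f /path/to/big-directory | head -20    # -f also skips sorting and additionally lists dotfiles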

Robᵩ
11

No. Inode limits are per-filesystem, and decided at filesystem creation time. You could be hitting another limit, or maybe 'ls' just doesn't perform that well.

Try this:

tune2fs -l /dev/DEVICE | grep -i inode

It should tell you all sorts of inode related info.
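If you're not sure which device backs the directory, df will tell you. The path and device name below are placeholders:

df /path/to/directory                   # first column shows the backing device, e.g. /dev/sda1
tune2fs -l /dev/sda1 | grep -i inode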

Jordi Bunster
5

What you hit is a practical limitation of ls, not a filesystem limit. Here is an article which explains it quite well: http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/

DragonTux
3

Maximum directory size is filesystem-dependent, and thus the exact limit varies. However, having very large directories is a bad practice.

You should consider making your directories smaller by sorting files into subdirectories. One common scheme is to use the first two characters for a first-level subdirectory, as follows:

${topdir}/aa/aardvark
${topdir}/ai/airplane

This works particularly well when using UUIDs, GUIDs, or content hash values for naming.
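Here is a minimal sketch of that reorganization in bash, assuming the files currently sit directly in ${topdir} (a placeholder) and every file name is at least two characters long:

#!/bin/bash
# Move each file into a subdirectory named after its first two characters,
# e.g. "aardvark" -> "aa/aardvark".
topdir=/path/to/topdir    # placeholder

cd "$topdir" || exit 1
for f in *; do
    [ -f "$f" ] || continue    # skip the subdirectories created below
    prefix=${f:0:2}            # first two characters of the file name
    mkdir -p "$prefix"
    mv -- "$f" "$prefix/"
done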

Charles Duffy
1

As noted by Rob Adams, ls is sorting the files before displaying them. Note that if you are using NFS, the NFS server will be sorting the directory before sending it, and 2 million entries may well take longer than the NFS timeout. That makes the directory unlistable via NFS, even with the -f flag.

This may be true for other network file systems as well.

While there's no enforced limit on the number of entries in a directory, it's good practice to put some bound on the number of entries you expect to accumulate.

mpez0
0

For NetBackup, the binaries that analyze directories on clients perform a type of listing that times out because of the enormous quantity of files in each folder (about one million per folder, in an SAP work directory).

My solution was (as Charles Duffy wrote in this thread) to reorganize the folders into subfolders with fewer files.

inspectorG4dget
mario
0

Can you get a real count of the number of files? Does it fall very near a 2^n boundary? Could you simply be running out of RAM to hold all the file names?

I know that in Windows, at least, file system performance would drop dramatically as the number of files in a folder went up, but I thought that Linux didn't suffer from this issue, at least if you were using a command prompt. God help you if you try to get something like Nautilus to open a folder with that many files.

I'm also wondering where these files come from. Are you able to calculate the file names programmatically? If that's the case, you might be able to write a small program to sort them into a number of sub-folders. Often, naming a specific file directly will grant you access where trying to look it up in a listing fails. For example, I have a folder in Windows with about 85,000 files where this works.

If this technique is successful, you might try finding a way to make this sort permanent, even if it's just running this small program as a cron job. It'll work especially well if you can sort the files by date somewhere.

Joel Coehoorn
0

Unless you are getting an error message, ls is working but very slowly. You can try looking at just the first ten files like this:

ls -f | head -10

If you're going to need to look at the file details for a while, you can put them in a file first. You probably want to send the output to a different directory than the one you are listing at the moment!

ls > ~/lots-of-files.txt

If you want to do something to the files, you can use xargs. If you decide to write a script of some kind to do the work, make sure that your script will process the list of files as a stream rather than all at once. Here's an example of moving all the files.

ls | xargs -I thefilename mv thefilename ~/some/other/directory

You could combine that with head to move a smaller number of the files.

ls | head -10000 | xargs -I x mv x /first/ten/thousand/files/go/here

You can probably combine ls and head in a shell script that will split up the files into a bunch of directories with a manageable number of files in each.
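Here is a rough sketch of that idea. It uses a plain bash loop instead of ls | head so that unusual file names can't trip up xargs; $dest and the batch size are placeholders:

#!/bin/bash
# Move files from the current directory into numbered subdirectories
# of $dest, at most $batch files per subdirectory.
dest=~/sorted-files    # placeholder
batch=10000
count=0
dir=0

for f in *; do
    [ -f "$f" ] || continue
    if [ $((count % batch)) -eq 0 ]; then
        dir=$((dir + 1))
        mkdir -p "$dest/batch-$dir"
    fi
    mv -- "$f" "$dest/batch-$dir/"
    count=$((count + 1))
done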

Joseph Bui
-1

Another option is find:

find . -name '*' -exec somecommand {} \;

{} is replaced with the path of each file found.

The advantage/disadvantage is that the files are processed one after each other.

find . -name '*' > ls.txt

would print all file names into ls.txt

find . -name '*' -exec ls -l {} \; > ls.txt

would print all the information from ls for each file into ls.txt

markus
  • You have to include the wildcard within single quotes if you don't want it to be expanded by the shell (the expansion can be quite long if there are 2+ million files!) – Didier Trosset Aug 24 '10 at 09:27
  • You should learn about the `xargs` command. It is much more efficient than the -exec option of the find command. – Didier Trosset Aug 24 '10 at 09:27
  • @Didier Trosset, new versions of the POSIX standard support `find ... -exec {} +` (rather than `-exec {} ;`), which has efficiency similar to xargs. – Charles Duffy Jan 04 '12 at 16:04
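To illustrate the last two comments: the first form below spawns one process per file, while the other two batch many files into each invocation (the -print0/-0 pair makes the xargs variant safe for arbitrary file names):

find . -type f -exec ls -l {} \;           # one ls process per file (slow)
find . -type f -exec ls -l {} +            # many files per ls invocation
find . -type f -print0 | xargs -0 ls -l    # equivalent batching via xargs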