
I know that there are some tricks to avoid the shell's limit that leads to "argument list too long", but I want to understand why the limit is hit in my case (even though it should not be). As far as I know, the maximum length of the argument list for a command can be determined by the following steps:

  1. Get the maximum argument size with `getconf ARG_MAX`.
  2. Subtract the size of your environment, retrieved with `env | wc -c`.

On my machine with Fedora 30 and zsh 5.7.1 this should allow argument lists with a length of up to 2085763 chars. But I already hit the limit at only 1501000 chars. What did I miss in my calculation?
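Expressed as a quick shell snippet, that calculation looks like this (variable names are just for illustration):

$ arg_max=$(getconf ARG_MAX)     # step 1: limit for execve() arguments plus environment
$ env_size=$(env | wc -c)        # step 2: bytes currently used by the environment
$ echo $((arg_max - env_size))   # naive estimate of the space left for arguments
2085763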


Minimal working example for reproduction:

Setting up files:

$ for i in {10000..100000}; do touch Testfile_${i}".txt"; done
$ ls Testfile*
zsh: argument list too long: ls

Now I deleted files stepwise (1000 files per step) to check when the argument list was short enough to be handled again:

$ for i in {10000..100000..1000}; do echo $(ls|wc -l); rm Testfile_{$i..$((i + 1000))}.txt; ls Testfile_*|wc -l; done

The message `zsh: argument list too long: ls` stops appearing somewhere between 79000 and 78000 remaining files. Each filename has a length of 18 chars (19 including the separating whitespace), so at this point the argument list should have a total length of 79000*19 = 1501000 or 78000*19 = 1482000 chars, respectively.

This result is of the same order of magnitude as the expected value of 2085763 chars, but it is still slightly off. What could explain the difference of roughly 500000 chars?


ADDENDUM1:

As suggested in the comments, I ran `xargs --show-limits` and the output roughly matches my expectation: the maximum usable length is the POSIX upper limit minus the size of the environment (2090321 - 4783 = 2085538).

$ xargs --show-limits
Your environment variables take up 4783 bytes
POSIX upper limit on argument length (this system): 2090321
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2085538
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647

ADDENDUM2:

Following the comment of @Jens, I added 9 bytes of overhead per word (8 bytes for the argv pointer, 1 for the terminating NUL byte). Now I get the following results (I do not know how the whitespace is handled, so for the moment I leave it out): 79000*(18+9) = 2133000 and 78000*(18+9) = 2106000.

Both values are much closer to the theoretical limit than before; indeed, they are even slightly above it. So, together with some safety margin, I am now more confident about estimating the maximum argument length in advance.
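A rough zsh sketch of this revised estimate (assuming 8 bytes of argv-pointer overhead and 1 NUL byte per word, as in @Jens's comment; the filename length of 18 chars is specific to this example):

$ arg_max=$(getconf ARG_MAX)                 # kernel limit for execve() arguments + environment
$ env_size=$(env | wc -c)                    # bytes currently taken by the environment
$ per_arg=$((18 + 1 + 8))                    # filename + terminating NUL + argv pointer
$ echo $(((arg_max - env_size) / per_arg))   # rough upper bound on the number of filenames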


Further reading:

There are more posts about this topic, none of which answers the question in a satisfying way, but they still provide good material:

  • I recall there is usually a limit to the number of "words" (as well as the overall character count). Try searching for `Limits`. In some systems it is documented in the man page for the shell you are using, but there may also be a `getconf` argument that will show that too. Sorry, I don't have time to research that. Good luck. – shellter Jul 29 '19 at 12:16
  • @JannekS.: What does `xargs --show-limits` say on your system about the _maximum length of a command_? – user1934428 Jul 30 '19 at 06:29
  • @JannekS : What is `getconf ARG_MAX` on your system? Your calculation seems to be odd. The ARG_MAX value itself should be the maximum number of parameters, which is more relevant than the maximum command length size. On my system, this is 32000. – user1934428 Jul 30 '19 at 06:36
  • @user1934428 I appended the output of `xargs --show-limits` to my post. `getconf ARG_MAX` returns on my system `2097152`. Did you really mean `ARG_MAX` because your value seems to me quite low...or maybe I misunderstand the meaning of this output – Jannek S. Jul 30 '19 at 08:19
  • Wild guess: for each argument, a pointer needs to be in `char *argv[]`, and each string is NUL-terminated. With, say, 1000 words on a 64bit system, this adds 9000 bytes of overhead to the actual words. Did you take that into account? – Jens Jul 30 '19 at 08:46
  • @JannekS. Indeed, `ARG_MAX` seems to be the one which is applicable here, see for instance [this explanation](https://www.in-ulm.de/~mascheck/various/argmax/). And an ARG_MAX of about 2 million seems to be quite a lot. On the page I linked to, you find typical ARG_MAX values for various systems near the bottom, and I am surprised that your `getconf` reports such a high value. – user1934428 Jul 30 '19 at 11:01
  • @user1934428 Thanks for the link, the content is really interesting, and I see now your concern with my value for ARG_MAX. I "appreciate" the statement _"The most reliable way to get the currently available space is to test the success of an exec() with increasing length of arguments until it fails."_ – Jannek S. Jul 30 '19 at 12:13
  • @Jens I did not consider this in my calculation and redid them, I'll edit my post according to this – Jannek S. Jul 30 '19 at 12:15

1 Answer


If you were looking to count files, this worked for me on macOS Ventura (13.1):

find . -maxdepth 2 -name "*.zip" | wc -l

I had 1039999 zip files and the standard "ls */*.zip | wc -l" just died ("zsh: argument list too long: ls").
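Since the limit only applies to execve(), a glob that is consumed by the shell itself (or by a builtin) avoids it entirely. A minimal zsh sketch of the same count without spawning ls, assuming the zip files sit one directory level deep as in the find example above (the (N) qualifier makes an empty match expand to nothing instead of an error):

$ files=( */*.zip(N) )   # glob is expanded inside the shell, no execve() involved
$ echo $#files           # number of matching files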
