All, I don't know what limit I have hit or whether this is an issue with valgrind
, libc
, or me
, but I need to know if this is reproducible, and if so, where this issue lies. I've boiled the issue down to an MCVE producable on 2 of my AMD boxes. Basically, I am dynamically allocating struct dirent *
pointers and then allocating a struct dirent
for each successful readdir
. valgrind
has no complaint for 1017
, but then on number 1018
, I get an invalid read
error (no reallocation involved), e.g.
==9881== Invalid read of size 8
==9881== at 0x4C2F316: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9881== by 0x40098E: main (readdir_mcve.c:35)
==9881== Address 0x51df070 is 0 bytes after a block of size 32,816 alloc'd
==9881== at 0x4C2ABD0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9881== by 0x4EE93E3: __alloc_dir (in /usr/lib/libc-2.23.so)
==9881== by 0x4EE94D2: opendir_tail (in /usr/lib/libc-2.23.so)
==9881== by 0x400802: main (readdir_mcve.c:9)
(the block of size 32,816
looks curious, but I haven't found any help breaking it down)
The code takes the directory name to open as the first argument, and then the limit of files to read as the second (default is 1000
):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>
int main (int argc, char **argv) {
DIR *dp = opendir (argc > 1 ? argv[1] : "."); /* open directory (. default) */
struct dirent *de = NULL, **dlist = NULL; /* ptr and ptr2ptr to dirent */
size_t nptrs = argc > 2 ? (size_t)strtoul (argv[2], NULL, 10) : 1000,
i = 0, idx = 0; /* index, allocation counter */
if (!dp) {
fprintf (stderr, "error: opendir failed.\n");
return 1;
}
/* allocate nptrs dirent pointers */
if (!(dlist = calloc (nptrs, sizeof *dlist))) {
fprintf (stderr, "error: virtual memory exhausted - dlist\n");
return 1;
}
while ((de = readdir (dp))) {
/* skip dot files */
if (!strcmp (de->d_name, ".") || !strcmp (de->d_name, ".."))
continue;
if (!(dlist[idx] = calloc (1, sizeof **dlist))) { /* alloc dirent */
fprintf (stderr, "error: dlist memory allocation failed\n");
return 1;
}
memcpy (dlist[idx++], de, sizeof *de); /* copy de to dlist[idx] */
if (idx == nptrs) /* post-check/realloc, insures sentinel NULL */
break;
}
closedir (dp);
for (i = 0; i < idx; i++) {
printf (" file[%3zu] : %s\n", i, dlist[i]->d_name);
free (dlist[i]);
}
free (dlist);
return 0;
}
You can create a simple test directory with:
$ mkdir readdir_tst
$ for i in {1..1024}; do
printf -v fname "file%04d" "$i"
touch "readdir_tst/$fname"
done
Then all is well reading 1014
filenames (1017
allocations):
$ valgrind ./bin/readdir_mcve readdir_tst 1014 > /dev/null
==9880== Memcheck, a memory error detector
==9880== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9880== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9880== Command: ./bin/readdir_mcve readdir_tst 1014
==9880==
==9880==
==9880== HEAP SUMMARY:
==9880== in use at exit: 0 bytes in 0 blocks
==9880== total heap usage: 1,017 allocs, 1,017 frees, 328,944 bytes allocated
==9880==
==9880== All heap blocks were freed -- no leaks are possible
==9880==
==9880== For counts of detected and suppressed errors, rerun with: -v
==9880== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
But on file 1015
(allocation 1018
) I hit an __alloc_dir
issue and valgrind throws an Invalid read of size 8 ... is 0 bytes after a block of size 32,816 alloc'd
:
$ valgrind ./bin/readdir_mcve readdir_tst 1015 > /dev/null
==9881== Memcheck, a memory error detector
==9881== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9881== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9881== Command: ./bin/readdir_mcve readdir_tst 1015
==9881==
==9881== Invalid read of size 8
==9881== at 0x4C2F316: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9881== by 0x40098E: main (readdir_mcve.c:35)
==9881== Address 0x51df070 is 0 bytes after a block of size 32,816 alloc'd
==9881== at 0x4C2ABD0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9881== by 0x4EE93E3: __alloc_dir (in /usr/lib/libc-2.23.so)
==9881== by 0x4EE94D2: opendir_tail (in /usr/lib/libc-2.23.so)
==9881== by 0x400802: main (readdir_mcve.c:9)
==9881==
==9881==
==9881== HEAP SUMMARY:
==9881== in use at exit: 0 bytes in 0 blocks
==9881== total heap usage: 1,018 allocs, 1,018 frees, 329,232 bytes allocated
==9881==
==9881== All heap blocks were freed -- no leaks are possible
==9881==
==9881== For counts of detected and suppressed errors, rerun with: -v
==9881== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
The code continues to read and print all directory entries just fine, but it is the valgrind
error that has me baffled. I had it reallocating instead of calling break
on reaching the allocation limit and it handles the 4000+
files in /usr/bin
without any issue other than the valgrind
error. (that was stripped as not relevant to the MCVE). The closest thing I've found on SO is Valgrind malloc leaks, but that wasn't applicable here. Can anyone else reproduce this and if so, is this valgrind
, libc
or me
?
note: I get the same results with libc-2.18.so
.
GNU libc dirent.h offers a bit more information
After being pointed in the right direction by the answer and continuing the search a bit, it seems there are a number of ways libc may determine the length of d_name
. It will depend on various definitions available to the compiler. In dirent.h explains:
46 /* This file defines `struct dirent'.
47
48 It defines the macro `_DIRENT_HAVE_D_NAMLEN' iff there is a `d_namlen'
49 member that gives the length of `d_name'.
...
59 */
...
67 /* These macros extract size information from a `struct dirent *'.
68 They may evaluate their argument multiple times, so it must not
69 have side effects. Each of these may involve a relatively costly
70 call to `strlen' on some systems, so these values should be cached.
71
72 _D_EXACT_NAMLEN (DP) returns the length of DP->d_name, not including
73 its terminating null character.
74
75 _D_ALLOC_NAMLEN (DP) returns a size at least (_D_EXACT_NAMLEN (DP) + 1);
76 that is, the allocation size needed to hold the DP->d_name string.
77 Use this macro when you don't need the exact length, just an upper bound.
78 This macro is less likely to require calling `strlen' than _D_EXACT_NAMLEN.
79 */
80
81 #ifdef _DIRENT_HAVE_D_NAMLEN
82 # define _D_EXACT_NAMLEN(d) ((d)->d_namlen)
83 # define _D_ALLOC_NAMLEN(d) (_D_EXACT_NAMLEN (d) + 1)
84 #else
85 # define _D_EXACT_NAMLEN(d) (strlen ((d)->d_name))
86 # ifdef _DIRENT_HAVE_D_RECLEN
87 # define _D_ALLOC_NAMLEN(d) (((char *) (d) + (d)->d_reclen) - &(d)->d_name[0])
88 # else
89 # define _D_ALLOC_NAMLEN(d) (sizeof (d)->d_name > 1 ? sizeof (d)->d_name : \
90 _D_EXACT_NAMLEN (d) + 1)
91 # endif
92 #endif
...
While there a number of different compile paths this can take based on the various defines, the one that is the most readable occurs if __USE_XOPEN2K8
is set:
221 #ifdef __USE_XOPEN2K8
222 ...
230 # ifdef __USE_MISC
231 # ifndef MAXNAMLEN
232 /* Get the definitions of the POSIX.1 limits. */
233 # include <bits/posix1_lim.h>
234
235 /* `MAXNAMLEN' is the BSD name for what POSIX calls `NAME_MAX'. */
236 # ifdef NAME_MAX
237 # define MAXNAMLEN NAME_MAX
238 # else
239 # define MAXNAMLEN 255
240 # endif
241 # endif
242 # endif
So in this case, d_name
is either NAME_MAX
or 255
depending on the NAME_MAX
definition (and thus set to 256
by the _D_ALLOC_NAMLEN (DP)
macro). Thanks to unwind for pointing me in the right direction. I don't know if we can ever know the exact answer of why 1017
struct dirent
allocations occur without issue and why valgrind
begins to complain on number 1018
, but at least we understand now where the source of the issue arises and why copying a struct dirent
with memcpy
may pose problems.