Recursively list directory content, plus check if file is directory

Question

I'm trying to learn assembly, so bear with me if my problems are elementary.

The following code scans a directory and prints out all files and directories in it, except those starting with a dot. It seems to work fine.

However, once I uncomment the call scandir line to turn on recursion, it prints out a long list of repeating filenames (see below for details).

Also, I'd like a way to test if the file is a directory or not. How would I do that? As far as I can tell that isn't the problem, for now, since if it's not a directory the call to scandir will just return without doing anything, but the check might become important later on (and it seems a good thing to do anyway).

[SECTION .data]
DirName     db 'test', 0

[SECTION .bss]

[SECTION .text]

extern puts                  ; Externals from glibc standard C library
extern opendir               ; Externals from dirent.h
extern closedir
extern readdir

global main


scandir:
    pushad                  ; Save caller's registers

    push eax                ; Directory is passed in eax
    call opendir            ; Open directory
    add esp, 4
    cmp eax, 0              ; opendir returns 0 on failure
    je .done

    mov ebx, eax            ; Move directory handle to ebx

    .read:
        push ebx            ; Push directory handle
        call readdir        ; Read a file from directory
        add esp, 4          ; Clean up the stack
        cmp eax, 0          ; readdir returns 0 on failure or when done
        je .close

        add eax, 11         ; File name is offset at 11 bytes

        mov cl, byte [eax]  ; Get first char of filename
        cmp cl, 46          ; Ignore files and dirs which begin with a dot
        je .read            ; (., .., and hidden files)

        ;call scandir       ; Call scandir recursively
                            ; If file is not a dir opendir will simply fail

        push eax
        call puts
        add esp, 4

        jmp .read

    .close:
        push ebx                ; Close directory
        call closedir
        add esp, 4
        jmp .done

    .done:
        popad                   ; Restore caller's registers
        ret

main:
    push ebp                ; Set up stack frame for debugger
    mov ebp, esp
    push ebx                ; Must preserve ebp, ebx, esi and edi
    push esi
    push edi
    ; start

    mov eax, DirName
    call scandir

    ; end
    pop edi                 ; Restore saved registers
    pop esi
    pop ebx
    mov esp, ebp            ; Destroy stack frame
    pop ebp
    ret

The directory structure of the test directory is like this:

bar [directory]
    bas.c
test1.c
test2.c
test3.c
foo.txt
test

Without recursion it prints out the files in the test directory as it should, but with recursion it seems to print the following:

test1.c
bar
test3.c
[repeat 3 lines ~1000 times]
test
foo.txt
test2.c
[repeat 3 lines ~1000 times]

Edit: This now mostly works, I think, except that it seems to jump back into the directory below 'test' initially, so it lists the files there and the files in 'test' twice.

[SECTION .data]
ParentDir   db '..', 0
CurrentDir  db '.', 0
DirName     db 'test', 0

[SECTION .bss]

[SECTION .text]
extern puts                 ; Externals from glibc standard C library
extern opendir              ; Externals from dirent.h
extern closedir
extern readdir
extern chdir

global main

scandir:
    pushad                  ; Save caller's registers

    push eax                ; Directory is passed in eax
    call opendir            ; Open directory
    add esp, 4
    cmp eax, 0              ; opendir returns 0 on failure
    je .done

    mov ebx, eax            ; Move directory handle to ebx

    .read:
        push ebx            ; Push directory handle
        call readdir        ; Read a file from directory
        add esp, 4          ; Clean up the stack
        cmp eax, 0          ; readdir returns 0 on failure or when done
        je .close

        add eax, 11         ; File name is offset at 11 bytes

        mov cl, byte [eax]  ; Get first char of filename
        cmp cl, 46          ; Ignore files and dirs which begin with a dot
        je .read            ; (., .., and hidden files)

        cmp byte [eax-1], 4
        jne .notdir

        push eax
        call chdir
        add esp, 4
        mov eax, CurrentDir
        call scandir        ; Call scandir recursively
        jmp .read

    .notdir:
        push eax
        call puts
        add esp, 4

        jmp .read

    .close:
        push ebx                ; Close directory
        call closedir
        add esp, 4

        push ParentDir
        call chdir
        add esp, 4

        jmp .done

    .done:
        popad                   ; Restore caller's registers
        ret

main:
    push ebp                ; Set up stack frame for debugger
    mov ebp, esp
    push ebx                ; Must preserve ebp, ebx, esi and edi
    push esi
    push edi
    ; start

    mov eax, DirName
    call scandir

    ; end
    pop edi                 ; Restore saved registers
    pop esi
    pop ebx
    mov esp, ebp            ; Destroy stack frame
    pop ebp
    ret

score 1 · Accepted Answer · answered Nov 19 '11 at 14:37

Your code looks almost fine. The problem is with the path name returned by readdir, which you want to use for the next recursive step. readdir returns only the entry's name, which is relative to the directory being read, not to the current working directory.

The massive output appears because the test/ directory contains an entry named test. When your loop reaches that name, you pass it to opendir, which resolves it against the current working directory and therefore reopens the same test/ directory. The recursion repeats until you run out of file handles.

One way to solve this is to call chdir after a successful opendir (and also chdir back to the parent directory after closedir) so that the working directory will always point to the one you're currently inspecting.
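For comparison, here is a minimal C sketch of the same control flow, using the libc calls the assembly already makes. The function name `list_tree` and its return value (the number of names printed) are made up for illustration. It assumes the filesystem fills in `d_type`, the member this answer uses to detect directories.

```c
#define _DEFAULT_SOURCE         /* exposes d_type and DT_DIR on glibc */
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print every non-hidden, non-directory name under
 * `name` and return how many names were printed. */
int list_tree(const char *name)
{
    DIR *dir = opendir(name);
    if (dir == NULL)
        return 0;               /* not a directory, or not readable */

    if (chdir(name) != 0) {     /* entry names now resolve against the cwd */
        closedir(dir);
        return 0;
    }

    int printed = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;           /* skip ".", ".." and hidden entries */
        if (entry->d_type == DT_DIR) {
            printed += list_tree(entry->d_name);
        } else {
            puts(entry->d_name);
            printed++;
        }
    }

    closedir(dir);
    if (chdir("..") != 0)       /* step back to the parent before returning */
        perror("chdir");
    return printed;
}
```

Calling `list_tree("test")` from the directory that contains test/ walks the tree and leaves the working directory where it started, because every successful `chdir(name)` is paired with a `chdir("..")`.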

> Also, I'd like a way to test if the file is a directory or not.

The dirent structure returned by readdir has a d_type member that you can check. In the 32-bit glibc layout your code uses, it sits at offset 10, one byte before d_name at offset 11:

...
call readdir
...
cmp byte [eax+10], 4    ; DT_DIR = 4
jne is_not_directory
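The constants 10 and 11 hold only for one particular `struct dirent` layout. 64-bit glibc builds, for example, place `d_type` at 18 and `d_name` at 19. Rather than trusting hardcoded numbers, you can ask the C compiler for the offsets on your target. The helper names below are made up for illustration.

```c
#define _DEFAULT_SOURCE         /* exposes d_type and DT_DIR on glibc */
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>

/* Offsets the assembly has to use for this target's struct dirent. */
size_t dirent_type_offset(void) { return offsetof(struct dirent, d_type); }
size_t dirent_name_offset(void) { return offsetof(struct dirent, d_name); }

/* Print the constants so they can be copied into the assembly source. */
void print_dirent_offsets(void)
{
    printf("d_type offset: %zu\n", dirent_type_offset());
    printf("d_name offset: %zu\n", dirent_name_offset());
    printf("DT_DIR value:  %d\n", DT_DIR);
}
```

The numbers are only meaningful if the file is compiled with the same target and flags as the library the assembly links against (for the 32-bit code in the question, `gcc -m32`).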

answered Nov 19 '11 at 14:37

Martin

Thanks. I think I have it mostly working now, but it's still causing me some trouble. See the edit in my post above. – Nov 19 '11 at 16:12
Still the same problem - you start off by listing the `test/` directory and then treat those paths as relative to the current working directory, which they are not. A quick fix (in `main`) would be to `chdir` into `DirName` and then `mov eax, CurrentDir` before you `call scandir` – Martin Nov 20 '11 at 05:38
Ah! Or I just move the `chdir` part to the start of `scandir`. Thank you for the help, it's working now – Nov 20 '11 at 10:43
Another good way to handle this is with `openat` (https://man7.org/linux/man-pages/man2/open.2.html), which resolves a relative path against an open directory FD, so you don't have to `chdir` in and `chdir` back out again. That also avoids the case where a symlink to a directory makes `chdir("foo")` followed by `chdir("..")` take you somewhere else. – Peter Cordes Mar 20 '22 at 22:35
Also, `d_type` is an optional optimization that some filesystems don't fill in. Reliable code *must* handle `DT_UNKNOWN` if it is to work on FAT, older XFS, and some other filesystems. [Checking if a dir. entry returned by readdir is a directory, link or file. dent->d\_type isn't showing the type](https://stackoverflow.com/a/29094555) – Peter Cordes Mar 20 '22 at 22:38
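The last two comments can be combined into one sketch: walk the tree with `openat` and `fdopendir` instead of `chdir`, and fall back to `fstatat` when the filesystem reports `DT_UNKNOWN`. This is a minimal C illustration under those assumptions, not code from the thread. The names `entry_is_dir` and `list_tree_at` are hypothetical.

```c
#define _DEFAULT_SOURCE         /* exposes d_type, DT_* and the *at() calls */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: is `entry`, read from the directory open on `dirfd`,
 * itself a directory? Falls back to fstatat() when d_type is not filled in. */
static int entry_is_dir(int dirfd, const struct dirent *entry)
{
    if (entry->d_type != DT_UNKNOWN)
        return entry->d_type == DT_DIR;

    struct stat st;
    if (fstatat(dirfd, entry->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return 0;               /* treat unreadable entries as non-directories */
    return S_ISDIR(st.st_mode);
}

/* Hypothetical helper: print every non-hidden, non-directory name under
 * `name`, resolved relative to `parentfd`; returns the number printed. */
int list_tree_at(int parentfd, const char *name)
{
    int fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return 0;               /* not a directory, or not readable */

    DIR *dir = fdopendir(fd);   /* the stream now owns fd */
    if (dir == NULL) {
        close(fd);
        return 0;
    }

    int printed = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;           /* skip ".", ".." and hidden entries */
        if (entry_is_dir(fd, entry)) {
            printed += list_tree_at(fd, entry->d_name);
        } else {
            puts(entry->d_name);
            printed++;
        }
    }

    closedir(dir);              /* also closes fd */
    return printed;
}
```

A top-level call would be `list_tree_at(AT_FDCWD, "test")`. The process never changes its working directory, so there is no `chdir("..")` that a symlink could redirect.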
