import os


def clean_dir(directories):
    for directory in directories:
        for root, dirs, files in os.walk(directory, topdown=False):
            for name in files:
                os.remove(os.path.join(root, name))
            for name in dirs:
                os.rmdir(os.path.join(root, name))

I have this function and I am trying to create unit tests for it. Any help?

Daniil Fajnberg
Praveen
  • Unit-testing this with a mocked-out `os` doesn't seem that useful since bugs are more likely to be due to misunderstandings of the interface than they are in the connecting code (which is just simple iteration). Are you sure you don't want to just use `shutil.rmtree`? https://docs.python.org/3/library/shutil.html#shutil.rmtree – Samwise Oct 01 '22 at 17:57
  • Help with what? What error are you experiencing? https://stackoverflow.com/help/minimal-reproducible-example – D.L Oct 01 '22 at 18:01
  • If you want to test that your code actually removes all subdirectories and files within the provided directories, you could create a temporary directory (https://docs.python.org/3/library/tempfile.html#tempfile.TemporaryDirectory), copy some test folder structures into it, run clean_dir on those structures, and check that everything is as expected afterwards. You can use `TemporaryDirectory` in a with statement so it gets cleaned up afterwards whatever happens. – John M. Oct 01 '22 at 18:07

1 Answer


The way I see it, you have three options, all of which were mentioned in the comments already.

1. Don't reinvent the wheel

Best practice is to use well-tested and established tools that are already at your disposal. In this case, as pointed out by @Samwise, shutil.rmtree does (almost) exactly what you do in your function. The difference is that your function only deletes the contents of the directories you pass in and leaves those top-level directories in place, while rmtree deletes them as well.

If you are fine with that or are OK with simply re-creating the deleted top-level directories afterwards, your function arguably becomes unnecessary since you can simply call rmtree in a loop. If not, I'll add my suggested substitute for your function at the end.
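A minimal sketch of that loop, assuming it is acceptable to delete each top-level directory and immediately re-create it (which resets its permissions and timestamps). The name reset_dirs is hypothetical and not from the question:

```python
from pathlib import Path
from shutil import rmtree


def reset_dirs(directories):
    """Delete each directory tree, then re-create it as an empty directory."""
    for directory in directories:
        rmtree(directory)
        Path(directory).mkdir()
```

Unlike the original function, this raises an error if one of the directories does not exist, because rmtree is called without ignore_errors.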

2. Create temporary directories and files

As suggested by @john-m, you can use tempfile.TemporaryDirectory to set up a directory tree during your test. Since your function does not call any other custom functions, just deletes files/directories, I would consider this a perfectly valid and pragmatic way to unit-test it. Here is an example:

from pathlib import Path
from tempfile import TemporaryDirectory
from unittest import TestCase


class MyTest(TestCase):
    def test_clean_dir(self) -> None:
        with TemporaryDirectory() as tmp_dir_name:
            # Create directories & files to be deleted:
            d1 = Path(tmp_dir_name, ".hidden_dir", "foo")
            d2 = Path(tmp_dir_name, "regular_dir")
            d1.mkdir(parents=True)
            d2.mkdir()
            Path(d1, "file").touch()
            Path(d1, ".hidden_file").touch()
            Path(d2, "spam.txt").touch()
            Path(tmp_dir_name, ".hidden").touch()
            Path(tmp_dir_name, "regular").touch()

            # Run our function, then ensure the directory is empty:
            clean_dir([tmp_dir_name])
            self.assertFalse(any(Path(tmp_dir_name).iterdir()))

This relies on creating a test setup that is as exhaustive as you can reasonably assume from your actual use case. Here I just threw in a few hidden directories and files, but I think this should cover you.
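If you want to cover a bit more, here is a sketch of a complementary test that passes two directories at once and also checks that the top-level directories themselves are left in place. It assumes the function from the question (copied here so the sketch is self-contained); the class and directory names are made up for illustration:

```python
import os
from pathlib import Path
from tempfile import TemporaryDirectory
from unittest import TestCase


def clean_dir(directories):
    # Copied from the question so that the sketch is self-contained.
    for directory in directories:
        for root, dirs, files in os.walk(directory, topdown=False):
            for name in files:
                os.remove(os.path.join(root, name))
            for name in dirs:
                os.rmdir(os.path.join(root, name))


class CleanDirExtraTest(TestCase):
    def test_multiple_directories_survive(self) -> None:
        with TemporaryDirectory() as tmp_dir_name:
            # Two separate top-level directories, each with nested content:
            tops = [Path(tmp_dir_name, "a"), Path(tmp_dir_name, "b")]
            for top in tops:
                Path(top, "nested", "deeper").mkdir(parents=True)
                Path(top, "nested", "file.txt").touch()

            clean_dir(tops)

            for top in tops:
                # The top-level directory must still exist, but be empty.
                self.assertTrue(top.is_dir())
                self.assertFalse(any(top.iterdir()))
```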

3. Mock os-functions

Since most of the functions you use have side-effects, it would also be a valid approach to mock them and verify that they have been called appropriately. (@Samwise briefly mentioned this option.)

Here it is important to know what you expect from the existing functions. You should not test third-party functions in your unit tests, i.e. you assume that they work as advertised and only check that your own code calls them correctly. Here is my way of doing this:

import os
from unittest import TestCase
from unittest.mock import MagicMock, call, patch


class MyTest(TestCase):

    @patch("os.rmdir")
    @patch("os.remove")
    @patch("os.walk")
    def test_clean_dir_with_mocks(
            self,
            mock_walk: MagicMock,
            mock_remove: MagicMock,
            mock_rmdir: MagicMock,
    ) -> None:
        # Set up mock directory tree:
        root_0 = "foo"
        dir_0_0 = "abc"
        dir_0_1 = "def"
        file_0_0 = "0_0.txt"
        file_0_1 = "0_1.txt"

        root_1 = "bar"
        dir_1_0 = "x"
        dir_1_1 = "y"
        file_1_0 = "1_0.txt"
        file_1_1 = "1_1.txt"

        mock_walk.return_value = [
            (root_0, [dir_0_0, dir_0_1], [file_0_0, file_0_1]),
            (root_1, [dir_1_0, dir_1_1], [file_1_0, file_1_1]),
        ]
        test_dir = "spam"

        # Ensure relevant functions were called correctly:
        self.assertIsNone(
            clean_dir([test_dir])
        )
        mock_walk.assert_called_once_with(test_dir, topdown=False)
        mock_remove.assert_has_calls([
            call(os.path.join(root_0, file_0_0)),
            call(os.path.join(root_0, file_0_1)),
            call(os.path.join(root_1, file_1_0)),
            call(os.path.join(root_1, file_1_1)),
        ])
        mock_rmdir.assert_has_calls([
            call(os.path.join(root_0, dir_0_0)),
            call(os.path.join(root_0, dir_0_1)),
            call(os.path.join(root_1, dir_1_0)),
            call(os.path.join(root_1, dir_1_1)),
        ])

Simulating two iterations of os.walk may or may not be overkill. The important thing is to correctly and completely check the mock objects for calls after you run your function. Check out the unittest.mock documentation for details.
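One detail worth knowing when checking calls: assert_has_calls only verifies that the given calls appear in sequence among the recorded calls, so extra calls before or after them go unnoticed. A small sketch of the difference, using a stand-alone MagicMock with made-up paths rather than the patched functions above:

```python
from unittest.mock import MagicMock, call

mock_remove = MagicMock()
mock_remove("foo/0.txt")
mock_remove("foo/unexpected.txt")  # an extra call the test did not anticipate

# Passes, because the expected call is contained in the recorded calls:
mock_remove.assert_has_calls([call("foo/0.txt")])

# A stricter check compares the complete list of recorded calls:
is_exact = mock_remove.call_args_list == [call("foo/0.txt")]
```

If you want the test to fail on unexpected deletions, comparing call_args_list (for example with assertEqual) is one way to do that.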


Alternative & suggestion

If you want the exact same functionality, but with a more concise and readable implementation, here is one that utilizes the wonderful pathlib module as well as the aforementioned shutil.rmtree:

from pathlib import Path
from shutil import rmtree
from typing import Union


def clean_dir(*directories: Union[str, Path]) -> None:
    for directory in directories:
        for element in Path(directory).iterdir():
            if element.is_file():
                element.unlink()
            else:
                rmtree(element)

The variadic *directories parameter allows the function to be called with an arbitrary number of strings (or Path objects), each representing a directory to "clean", e.g. clean_dir("foo/", "bar/", "baz/"). I usually prefer this kind of interface as it gives me the option of calling the function without constructing a list, tuple, or other iterable around my arguments. If I do have a list of paths, I can still call it simply by doing clean_dir(*list_of_paths). But if you don't want this, simply omit the * (and adjust the type annotation to Iterable[Union[str, Path]]).
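For comparison, a sketch of what the non-variadic variant could look like, i.e. the same body with a single iterable parameter, which keeps the call signature of the function in the question:

```python
from pathlib import Path
from shutil import rmtree
from typing import Iterable, Union


def clean_dir(directories: Iterable[Union[str, Path]]) -> None:
    for directory in directories:
        for element in Path(directory).iterdir():
            # Files are unlinked directly; everything else is treated
            # as a directory tree and removed recursively.
            if element.is_file():
                element.unlink()
            else:
                rmtree(element)
```

With this signature, existing calls such as clean_dir(["foo/", "bar/"]) keep working unchanged.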

Since I usually work with Path objects anyway, when doing anything with the filesystem, I like to annotate functions that take path arguments accordingly.

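To test this function, I would definitely use option 2, i.e. run it on an actual, real, temporary test directory. The function passes the test method I provided above once the call is changed from clean_dir([tmp_dir_name]) to clean_dir(tmp_dir_name), since the variadic version takes the paths as separate arguments rather than as a list.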
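A sketch of how such a test could look for the variadic signature, with the directory passed as a plain argument. The function is repeated so the example is self-contained, and the class name is hypothetical:

```python
from pathlib import Path
from shutil import rmtree
from tempfile import TemporaryDirectory
from typing import Union
from unittest import TestCase


def clean_dir(*directories: Union[str, Path]) -> None:
    for directory in directories:
        for element in Path(directory).iterdir():
            if element.is_file():
                element.unlink()
            else:
                rmtree(element)


class VariadicCleanDirTest(TestCase):
    def test_clean_dir(self) -> None:
        with TemporaryDirectory() as tmp_dir_name:
            # Create a nested directory and a few files to be deleted:
            nested = Path(tmp_dir_name, ".hidden_dir", "foo")
            nested.mkdir(parents=True)
            Path(nested, "file").touch()
            Path(tmp_dir_name, "regular").touch()

            # Note: no list around the argument here.
            clean_dir(tmp_dir_name)
            self.assertFalse(any(Path(tmp_dir_name).iterdir()))
```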


Hope this helps.

Daniil Fajnberg