0

Here is a simplified version of the problem I am facing: Let's say I have a function that accepts a path to a directory and then removes all of its content except (optionally) a designated "keep file",

import os

KEEP_FILE_CONSTANT = '.gitkeep'

def clear_directory(directory: str, retain: bool = True) -> bool:
    try:
        filelist = list(os.listdir(directory))
        for f in filelist:
            filename = os.path.basename(f)
            if retain and filename == KEEP_FILE_CONSTANT:
                continue
            os.remove(os.path.join(directory, f))
        return True
    except OSError as e:
        print(e)
        return False

I am trying to write a unit test for this function that verifies the os.remove was called. This is currently how I am testing it:

import pytest
from unittest.mock import ANY

@pytest.mark.parametrize('directory', [
     ('random_directory_1'),
     ('random_directory_2'),
     # ...
])
@patch('module.os.remove')
def test_clear_directory(delete_function, directory):
    clear_directory(directory)
    delete_function.assert_called()
    delete_function.assert_called_with(ANY)

Ideally, what I would like to assert in the test is the delete_function was called with an argument containing directory, i.e. something like,

delete_function.assert_called_with(CONTAINS(directory)) 

or something of that nature. I have been looking at PyHamcrest, specifically the contains_string function, but I am not sure how to apply it here or if it's even possible.

Is there any way to implement a CONTAINS matcher for this use case?

Grant Moore
  • 153
  • 1
  • 10
  • 2
    You could iterate over `call_args_list` and check the call args (something like `for call_args in delete_function.call_args_list: assert directory in call_args.args[0]`). – MrBean Bremen Jun 30 '22 at 16:15

1 Answers1

1

This isn't a direct answer to your question, but if I were writing these tests, I would take a different approach:

  • Make a temporary directory.
  • Actually delete the files.
  • Check that only the expected files remain.

This way, you're testing the actual behavior you want, and not depending on internal implementation details (i.e. the fact that you're using os.remove() instead of some alternative like Pathlib.unlink()).

If you're not familiar, pytest provides a tmp_path fixture for exactly this kind of test. However, it's still a bit of a chore to fill in the temporary directory, especially if you want to test a variety of nested file hierarchies. I wrote a fixture called tmp_files to make this easier, though, and I think it might be a good fit for your problem. Here's how the test would look:

import pytest

# I included tests for nested files, even though the sample function you
# provided doesn't support them, just to show how to do it.

@pytest.mark.parametrize(
    'tmp_files, to_remove, to_keep', [
        pytest.param(
            {'file.txt': ''},
            ['file.txt'],
            [],
            id='remove-1-file',
        ),
        pytest.param(
            {'file-1.txt': '', 'file-2.txt': ''},
            ['file-1.txt', 'file-2.txt'],
            [],
            id='remove-2-files',
        ),
        pytest.param(
            {'file.txt': '', 'dir/file.txt': ''},
            ['file.txt', 'dir/file.txt'],
            [],
            id='remove-nested-files',
            marks=pytest.mark.xfail,
        ),
        pytest.param(
            {'.gitkeep': ''},
            [],
            ['.gitkeep'],
            id='keep-1-file',
        ),
        pytest.param(
            {'.gitkeep': '', 'dir/.gitkeep': ''},
            [],
            ['.gitkeep', 'dir/.gitkeep'],
            id='keep-nested-files',
            marks=pytest.mark.xfail,
        ),
    ],
    indirect=['tmp_files'],
)
def test_clear_directory(tmp_files, to_remove, to_keep):
    clear_directory(tmp_files)

    for p in to_remove:
        assert not os.path.exists(tmp_files / p)
    for p in to_keep:
        assert os.path.exists(tmp_files / p)

To briefly explain, the tmp_files parameters specify what files to create in each temporary directory, and are simply dictionaries mapping file names to file contents. Here all the files are simple text files, but it's also possible to create symlinks, FIFOs, etc. The indirect=['tmp_files'] argument is easy to miss but very important. It tells pytest to parametrize the tmp_files fixture with the tmp_files parameters.

Kale Kundert
  • 1,144
  • 6
  • 18