0

I am working on a python project, where we read parquet files from azure datalake and perform the required operations. We have defined a common parquet file reader function which we use to read such files. I am using MagicMock to mock objects and write test cases, when I ran the test cases some tests were failing because they have mocked values of some other test cases. I didn't understand this behavior completely. Below is my code

utils/common.py

def parquet_reader(parquet_path: str) -> Tuple[Dict, int]:
    table = pq.read_table(parquet_directory_path)
    return table.to_pydict(), table.num_rows

-------Test01-------------

PARQUET_DATA = {
    "se_vl": ["530"],
    "l_depart": ["028"],
    "r_code": ["f5"],
    "r_dir": ["W"],
}

NUM_RECORDS = 1

TEST_CAF_DATA = (PARQUET_DATA, NUM_RECORDS)

def test_read_from_source():
    common.parquet_reader = MagicMock(return_value=TEST_CAF_DATA)
    obj = next(read_from_source('path to file'))

    assert obj.se_vl == "530"

-------Test02-------------

from utils import common

SAMPLE_DATA_PATH = "src/schedule/sample_data"

def test_parquet_reader():
    rows, _ = common.parquet_reader(SAMPLE_DATA_PATH)

    assert rows["key_id"][0] == 345689865
    assert rows["com"][0] == "UP"

When I run all the tests(total of 240 test) than the variable 'rows' in test02 holds the data for test01 (PARQUET_DATA). However I fixed the above issue using patch. But still confused as why such behaviour using MagicMock?

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
Kashyap
  • 385
  • 3
  • 13
  • `patch` reverts the mocking after the test automátically, but your code doesn't do that, so it stays mocked. This is why you should use `patch` in such cases. – MrBean Bremen Nov 17 '21 at 23:18
  • Thanks, @MrBeanBremen I got that mocks are not cleared but was wondering how can mock from one test file can affect mocks in another test file. If I was mocking a method multiple places in the same file, then it was understood that mocks won't be cleared and cause a mismatch. What really caught my attention was how com mocks from different files are colliding with each other. – Kashyap Nov 18 '21 at 08:35
  • 1
    You are running the tests together, and if you mock the imported module in one test, it will not be reimported for the second test module (imports are cached), so it stays mocked. – MrBean Bremen Nov 18 '21 at 08:42
  • Thanks for the explanation @MrBeanBremen, but is it a good idea to cache imports in test cases, it may cause colliding effects like the one I faced. – Kashyap Nov 18 '21 at 09:04
  • 1
    That is just standard Python functionality how imports work and has nothing to do with tests. Or you could say, that the way it works demands that you reset your mocks for components used elsewhere. – MrBean Bremen Nov 18 '21 at 09:30

0 Answers0