I have two functions in a Python file, and I want to unit test them using Mock.

from functools import reduce  # built in on Python 2, must be imported on Python 3

def col_rename(col_name):
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    new_cols = reduce(lambda a, kv: a.replace(*kv), reps, col_name)
    return new_cols

def rename_characters(df):
    df_cols = df.schema.names
    for x in df_cols:
        df = df.withColumnRenamed(x, col_rename(x))
    return df

In the code above, withColumnRenamed is a PySpark DataFrame method that returns a new DataFrame with the given column renamed. df is a PySpark DataFrame.

I am able to unit test the col_rename function.
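For reference, such a test could look roughly like the sketch below. It assumes Python 3 (hence the `functools` import), repeats `col_rename` so the snippet runs on its own, and uses a made-up column name and test class name for illustration.

```python
import unittest
from functools import reduce


def col_rename(col_name):
    # Same replacement table as in the question.
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    return reduce(lambda a, kv: a.replace(*kv), reps, col_name)


class ColRenameTest(unittest.TestCase):  # hypothetical test class name
    def test_replaces_special_characters(self):
        # 'a b(c){d}' is an invented sample column name.
        self.assertEqual(col_rename('a b(c){d}'), 'a_&b*_c_*#_d_#')

    def test_leaves_plain_names_untouched(self):
        self.assertEqual(col_rename('plain_name'), 'plain_name')
```

Saved as a test module, this can be run with `python -m unittest`.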

I am also able to unit test the rename_characters function by creating data frames manually in PySpark.

Now I want to write these unit tests using Mock in Python.

I have tried something like the code below. I am not sure whether this is correct or whether what I am doing is completely wrong.

import unittest
from mock import patch

class Test(unittest.TestCase):
    @patch('mymodule.rename_characters')
    def test_func(self, rename_characters_mock):
        rename_characters_mock.return_value = 'mocked values'
        self.assertEqual(return_value, 'mocked_values'))

How can I use mocking for unit testing in the scenario above?

User12345
  • `from mymodule import rename_characters`, r u sure we can `import` a func? – Gang Mar 15 '18 at 02:39
  • @Gang In `pycharm` it gave me unused import statement error, I removed the import statement – User12345 Mar 15 '18 at 02:46
  • Almost there. It makes more sense if you want to mock `pyspark.x`. In `self.assertEqual(return_value, 'mocked_values'))` the `return_value` is not defined. Are you trying to do `self.assertEqual(mymodule.rename_characters(), 'mocked_value')`? – Gang Mar 15 '18 at 02:52
  • @Gang I want to try `self.assertEqual(mymodule.rename_characters(), 'mocked_value')` – User12345 Mar 15 '18 at 03:00

1 Answer

You might need this import:

import mymodule

Outside the Test class, define a local function:

def local_rename_characters():
    return 'mocked_local_values'

This should work:

@patch('mymodule.rename_characters')
def test_func(self, rename_characters_mock):
    # The mocked return value must match the string asserted below.
    rename_characters_mock.return_value = 'mocked_values'
    self.assertEqual(mymodule.rename_characters(), 'mocked_values')

Alternatively, use side_effect:

@patch('mymodule.rename_characters')
def test_func(self, rename_characters_mock):
    rename_characters_mock.side_effect = local_rename_characters
    self.assertEqual(mymodule.rename_characters(), 'mocked_local_values')
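Note that the tests above replace rename_characters itself, so they only check what the mock returns. If the goal is to exercise the real function without a Spark session, one option is to pass in a `MagicMock` that stands in for the DataFrame. The following is only a sketch under some assumptions: it uses the standard-library `unittest.mock` (Python 3.3+) rather than the `mock` package, it repeats the two functions from the question so it runs on its own (in practice you would import them from `mymodule`), and `make_fake_df` and the column names are made up for illustration.

```python
import unittest
from functools import reduce
from unittest.mock import MagicMock, call


def col_rename(col_name):
    reps = ((' ', '_&'), ('(', '*_'), (')', '_*'), ('{', '#_'), ('}', '_#'))
    return reduce(lambda a, kv: a.replace(*kv), reps, col_name)


def rename_characters(df):
    for x in df.schema.names:
        df = df.withColumnRenamed(x, col_rename(x))
    return df


def make_fake_df(names):
    """Hypothetical helper: a MagicMock standing in for a PySpark DataFrame."""
    df = MagicMock(name='df')
    df.schema.names = names
    # Return the same mock on every call so all renames are recorded in one place.
    df.withColumnRenamed.return_value = df
    return df


class RenameCharactersTest(unittest.TestCase):
    def test_renames_every_column(self):
        df = make_fake_df(['first name', 'amount(usd)'])

        result = rename_characters(df)

        # The function should hand back the (mocked) renamed DataFrame...
        self.assertIs(result, df)
        # ...after calling withColumnRenamed once per column, in order.
        self.assertEqual(
            df.withColumnRenamed.call_args_list,
            [call('first name', 'first_&name'),
             call('amount(usd)', 'amount*_usd_*')],
        )
```

This only verifies that rename_characters makes the expected calls; it does not check that Spark really renames the columns, which still needs a test against a real DataFrame.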
Gang
  • If I want to mock the `withColumnRenamed` method that is called inside the `rename_characters` function, how can I do that? – User12345 Mar 15 '18 at 21:26
  • The `DataFrame` instance is already a variable, so you do not really need to mock. But if you do want to play and learn: `@patch('mymodule.pyspark.sql.DataFrame.withColumnRenamed')` and then `mocked_withColumnRenamed.side_effect = local_create_dummy_df`, using `createDataFrame` – Gang Mar 15 '18 at 23:21
  • could you please have a look at `https://stackoverflow.com/questions/49420660/unit-test-pyspark-code-using-python` – User12345 Mar 22 '18 at 05:09