I am developing a set of python scripts to pre-process a dataset then produce a series of machine learning models using scikit-learn. I would like to develop a set of unittests to check the data pre-processing functions, and would like to be able to use a small test pandas dataframe for which I can determine the answers for and use it in assert statements.
I cannot seem to get it to load the dataframe and to pass it to the unit tests using self. My code looks something like this;
def setUp(self):
TEST_INPUT_DIR = 'data/'
test_file_name = 'testdata.csv'
try:
data = pd.read_csv(INPUT_DIR + test_file_name,
sep = ',',
header = 0)
except IOError:
print 'cannot open file'
self.fixture = data
def tearDown(self):
del self.fixture
def test1(self):
self.assertEqual(somefunction(self.fixture), somevalue)
if __name__ == '__main__':
unittest.main()
Thanks for the help.