I have the following project structure:

```
work_directory/
    merge.py
    a_package/
```

(i.e. a Python file merge.py and a directory a_package under the directory "work_directory")
I wrote a MapReduce job with mrjob in merge.py, in which I need to import a_package (i.e. `from a_package import something`). But I am having difficulty getting a_package uploaded to Hadoop.
I have tried the method from the mrjob docs (https://mrjob.readthedocs.io/en/latest/guides/writing-mrjobs.html#using-other-python-modules-and-packages): I set DIRS on the job class and imported the package from inside the mapper:

```python
from mrjob.job import MRJob

class MRPackageUsingJob(MRJob):
    DIRS = ['a_package']

    def mapper(self, key, value):
        from a_package import something
```
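For what it's worth, this is roughly how I launch the job (the HDFS paths are just examples from my setup):

```
python merge.py -r hadoop hdfs:///path/to/input -o hdfs:///path/to/output
```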
I also tried the setup-script approach from the cookbook: https://mrjob.readthedocs.io/en/latest/guides/setup-cookbook.html#uploading-your-source-tree
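Concretely, following that section, I tarred up the package and added the uploaded archive to PYTHONPATH via --setup (the archive name is just what I chose; this is a sketch of my attempt):

```
# create a tarball whose root contains the a_package/ directory
tar -czf a_package.tar.gz -C work_directory a_package
# upload it and point PYTHONPATH at the unpacked archive on the task nodes
python merge.py -r hadoop --setup 'export PYTHONPATH=$PYTHONPATH:a_package.tar.gz#/' hdfs:///path/to/input
```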
But neither of them works; the job keeps failing with `ImportError: No module named a_package`.

What should I do?