I've created a python UDF to convert datetimes into different timezones. The script uses pytz which doesn't ship with python (or jython). I've tried a couple things:
- Bootstrapping PIG to install it's own jython and including pytz in that jython installation. I can't get PIG to use the newly installed jython, it keeps reverting to Amazon's jython.
- Setting PYTHONPATH to a local directory where the new modules have been installed
- Setting HADOOP_CLASSPATH/PIG_CLASSPATH to the new installation of jython
Each of these ends up with "ImportError: No module named pytz" when I try to load the UDF script. The script loads fine if I remove pytz so it's definitely the external module that's giving it problems.
Edit: Originally put this as a comment but I thought I'd just make it an edit:
I've tried every way I know of to get PIG to recognize another jython jar. That hasn't worked. Amazon's jython is here: /home/hadoop/.versions/pig-0.9.2/lib/pig/jython.jar, with is recognizing this sys.path: /home/hadoop/lib/Lib. I can't figure out how to build external libraries against this jar.