I'm new to Spark and cannot import the wikipedia package from sift.corpora. I get this error: "ImportError: No module named 'sift.corpora'". Here is the notebook I'm working on. Thank you for your help!
1 Answer
This is primarily a Python issue, not a Spark issue. The error message says that Python cannot find the module you are trying to import. The sift documentation says that you have to install the Python package before you can use it:
pip install git+http://git@github.com/wikilinks/sift.git
You have to run this command on every Spark node, not only on the driver: Spark is a distributed environment, and the worker processes import the package as well.
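To confirm whether the install worked, you can ask Python if it can locate the module before you import it. The following is only a minimal sketch using the standard library; the helper name `is_importable` is hypothetical and not part of sift or Spark.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate `module_name` in this interpreter."""
    try:
        # find_spec imports parent packages first, so a missing parent
        # (e.g. 'sift' for 'sift.corpora') raises ModuleNotFoundError,
        # which is a subclass of ImportError.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # 'sift.corpora' is the module from the question.
    if is_importable("sift.corpora"):
        print("sift.corpora is available in this interpreter")
    else:
        print("sift.corpora is missing here; run the pip install on this node")
```

This only checks the interpreter it runs in, which in a notebook is usually the driver. Assuming `sc` is an active SparkContext, something like `sc.parallelize(range(4)).map(lambda _: is_importable("sift.corpora")).collect()` should run the same check inside worker tasks, although it is not guaranteed to reach every node.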

cronoik