Can I use Hadoop & MapReduce in Jupyter/IPython? Is there something similar to what PySpark is for Spark?
Viewed 4,823 times
3
There is a Python API for Hadoop: http://crs4.github.io/pydoop/. Can you be more specific about what you are trying to achieve? – Govind Aug 13 '15 at 02:08
1 Answer
3
Of course you can. There are many frameworks for this: Hadoop Streaming, mrjob and dumbo, to name a few. Technically, using them from Jupyter comes down to either subprocess.Popen()
calls or ordinary Python imports, depending on the framework.
A nice overview/critique of some of these frameworks can be found in this Cloudera blog post.
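
As a rough illustration of the subprocess.Popen() route, here is a minimal sketch for Hadoop Streaming. It assumes the `hadoop` executable is on the PATH of the notebook server and that `mapper.py` and `reducer.py` are executable scripts with a shebang line in the notebook's working directory. The jar location, the HDFS paths and the helper names `build_streaming_command` and `run_streaming_job` are all hypothetical, so adjust them to your setup.

```python
import shutil
import subprocess


def build_streaming_command(streaming_jar, input_path, output_path,
                            mapper, reducer):
    """Return the argv list for a Hadoop Streaming job.

    The generic -files option has to come before the streaming options;
    it ships the mapper and reducer scripts to the cluster.
    """
    return [
        "hadoop", "jar", streaming_jar,
        "-files", f"{mapper},{reducer}",
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


def run_streaming_job(cmd):
    """Run the job from a notebook cell and return (exit_code, log_text)."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("the 'hadoop' executable is not on PATH")
    # Merge stderr into stdout so the whole job log comes back as one string.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    log, _ = proc.communicate()
    return proc.returncode, log


# Hypothetical paths: replace with your streaming jar and HDFS directories.
cmd = build_streaming_command(
    streaming_jar="/path/to/hadoop-streaming.jar",
    input_path="/user/me/input",
    output_path="/user/me/output",
    mapper="mapper.py",
    reducer="reducer.py",
)
print(" ".join(cmd))
```

In a notebook cell you would then call `exit_code, log = run_streaming_job(cmd)` and inspect `log`. Building the command separately from running it makes it easy to check the arguments before submitting anything to the cluster. For frameworks that are plain Python packages, a regular import in the notebook takes the place of the subprocess call.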

SergeyR

Dimitris Fasarakis Hilliard