My objective is to cluster images with the map-reduce framework on Hadoop. For the map-reduce jobs I am using Python and the mrjob package, but I cannot work out the logic for processing the images. The images are in .tif format. My questions are:
- In what format should I store the images in HDFS so that I can retrieve them for map-reduce in Python?
- I do not understand how the I/O pipeline works when using Python with Hadoop.