
Can we use hadoop to run SIFT on multiple images?

SIFT takes ~1 s per image to extract keypoints and their descriptors. Given that each run is independent of the others and the runtime of a single run cannot be reduced, can we reduce the overall runtime in any way?

Multithreading can reduce the runtime by roughly a factor of the number of CPU cores available, since each core can process a different image in parallel.
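As a minimal sketch of that idea (the `extract_sift` function and file names below are hypothetical placeholders; a real implementation would call an actual SIFT library such as OpenCV's `cv2.SIFT_create().detectAndCompute`), a process pool with one worker per core processes independent images concurrently:

```python
import time
from multiprocessing import Pool, cpu_count

def extract_sift(image_path):
    """Placeholder for real SIFT extraction (e.g. via OpenCV).
    The sleep simulates the ~1 s per-image cost described above."""
    time.sleep(0.05)  # stand-in for the actual keypoint/descriptor work
    return (image_path, "keypoints+descriptors")

def main():
    # Hypothetical batch of images to process.
    images = [f"img_{i}.jpg" for i in range(8)]
    # One worker per core: with N cores, N images are processed at once,
    # giving roughly an N-fold speedup since each run is independent.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(extract_sift, images)
    return results

if __name__ == "__main__":
    print(len(main()))
```

Because Python threads share the GIL, `multiprocessing` (separate processes) is the safer choice for CPU-bound work like this; a C++ SIFT implementation could use plain threads instead.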

Can Hadoop be used in any way to parallelize the run across multiple images? If yes, by what factor can it reduce the runtime, supposing we have a 3-node cluster?

Ankit Nayan
This is a generic batch processing question more suitable for Stack Overflow. In the end, it doesn't matter that you are running SIFT or any other image processing algorithm, since you are interested in speeding up the processing of a large collection of small images, not in parallelizing the processing of a single big image. This is done simply by firing more analysis tasks concurrently. Hadoop is probably overkill for that; you just need mappers. The speed-up depends on the number of nodes, the number of cores, and how many cores your SIFT implementation already keeps busy. Have those numbers at hand. – pichenettes May 14 '14 at 12:17
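The map-only approach described in the comment can be sketched as a Hadoop Streaming mapper: each input line is an image path, and the job runs with zero reducers since no aggregation is needed. The `fake_sift` stub and the input format are assumptions for illustration; a real mapper would load the image and run an actual SIFT implementation.

```python
import sys

def fake_sift(path):
    """Hypothetical stub: a real mapper would load `path` and run
    SIFT here, returning the extracted keypoint count/descriptors."""
    return 0

def map_line(line):
    """Map one input record (an image path) to `path<TAB>result`,
    the usual key/value output convention for Hadoop Streaming."""
    path = line.strip()
    n_keypoints = fake_sift(path)  # stand-in for real extraction
    return f"{path}\t{n_keypoints}"

if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin, one per line.
    for line in sys.stdin:
        print(map_line(line))
```

Such a script would be submitted with the Hadoop Streaming jar, setting the number of reduce tasks to 0 so only mappers run; ideally a 3-node cluster then gives close to a 3x speedup (times the cores used per node), minus job scheduling and data transfer overhead.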

1 Answer


Could you give some good references for mappers? What kinds of mappers would be relevant for this job?

godot101