I want to store image files (JPEG, PNG, etc.) on HDFS (Hadoop Distributed File System). I tried two ways:
- Saved the image files as is (i.e. in their original format) to HDFS using the `put` command. The full command was: `hadoop fs -put /home/a.jpeg /user/hadoop/`. The file was placed successfully.
- Converted the image files into Hadoop's Sequence File format and then saved them to HDFS using the `put` command.
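
To illustrate what the second approach amounts to, here is a minimal Python sketch of the general idea: packing many small files into one container of (file name, file bytes) records. It is not Hadoop's actual Sequence File layout or API. The record layout and the names `pack_files` and `unpack_files` are hypothetical, chosen only for illustration.

```python
import struct
from pathlib import Path


def pack_files(paths, container):
    """Write each file into `container` as one (name, bytes) record.

    Hypothetical layout, NOT the real Sequence File format:
    4-byte key length, 4-byte value length, key bytes, value bytes.
    """
    with open(container, "wb") as out:
        for p in paths:
            key = Path(p).name.encode("utf-8")   # file name is the key
            value = Path(p).read_bytes()         # raw image bytes are the value
            out.write(struct.pack(">II", len(key), len(value)))
            out.write(key)
            out.write(value)


def unpack_files(container):
    """Yield (name, bytes) records back from the container, in order."""
    with open(container, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:                  # end of container
                break
            key_len, value_len = struct.unpack(">II", header)
            key = f.read(key_len).decode("utf-8")
            value = f.read(value_len)
            yield key, value
```

With this layout, many images end up in a single file (which could then be uploaded with `put`), and each image can still be recovered by its name.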
I want to know which format should be used to store the images in HDFS.
Also, what are the pros of using the Sequence File format? One advantage I know of is that it is splittable. Are there any others?