
I upload an image to HBase from a Java program. After retrieving the image, I found that the file size had changed (it increased) and that most of the Exif and other metadata (GPS location, camera details, etc.) was lost.

Code:

public ArrayList<Object> uploadImagesToHbase(MultipartFile uploadedFileRef){
    byte[] bytes =uploadedFileRef.getBytes();
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    ImageIO.write(image, "jpg", outputStream);
    HBaseAdmin admin = new HBaseAdmin(configuration);
    HTable table = new HTable(configuration, "sample");
    Put image = new Put(Bytes.toBytes("1"));
    image.add(Bytes.toBytes("DataColumn"), Bytes.toBytes(DataQualifier), bytes);
    table.put(image);

How can I store and retrieve an image without any change or loss?

  • How did you store it, and what is your code to store it? – Whitefret Apr 30 '16 at 08:58
  • It has nothing to do with Hadoop or HBase. The image is different because you re-encoded it in the first three lines of code. Instead, you need to load your image from the file into a byte buffer without using `ImageIO`. – gudok May 01 '16 at 06:57
  • OK, I have now modified my code to `byte[] bytes = uploadedFileRef.getBytes();`, where `uploadedFileRef` is a `MultipartFile`. – Sharavanakumaar Murugesan May 01 '16 at 09:49
  • The code you have posted seems very bogus, with multiple variables named `image` etc... But anyway, @gudok is correct, just store the `bytes` byte array directly to your database, and you should be fine. – Harald K May 01 '16 at 17:04
  • Can you cross check your code of generating byte array with http://www.mkyong.com/java/how-to-convert-byte-to-bufferedimage-in-java/ ? – Ram Ghadiyaram May 02 '16 at 06:38
  • @RamPrasadG You are missing the point. `ImageIO` or `BufferedImage` is not needed here. He should just store the bytes to database. – Harald K May 02 '16 at 13:20
  • No, I was not missing the point. In fact, I was asking him to cross-check how he created the byte array that he stores in HBase, i.e. `imageInByte = baos.toByteArray();` from the code given in the link. – Ram Ghadiyaram May 02 '16 at 13:39
  • As @gudok already stated, ImageIO is the problem in this case, not the cure. The link posted by RamPrasad G does not write metadata, so it can't possibly solve the OP's problem. If gudok wants to post an answer, I'm happy to delete mine and upvote his. – Harald K May 02 '16 at 14:47

2 Answers


Most likely you are just over-complicating things. :-)

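The reason you are losing the Exif and other metadata is that the ImageIO convenience methods ImageIO.read(...) and ImageIO.write(...) do not preserve metadata. They decode the file into pixels and re-encode those pixels as a new JPEG, which is also why the file size changes. The good news is that they are not needed.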
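To make the effect visible without a real camera file, here is a minimal sketch using only the JDK. It assumes a JPEG comment (COM) segment is an acceptable stand-in for Exif data; the class name `ReencodeDemo`, its helper methods, and the comment text `camera=demo` are made up for this illustration.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answer.
class ReencodeDemo {

    // Builds a small JPEG in memory and inserts a COM (comment) segment
    // right after the SOI marker, as a stand-in for real Exif metadata.
    static byte[] jpegWithComment(String comment) throws IOException {
        BufferedImage pixels = new BufferedImage(16, 16, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        ImageIO.write(pixels, "jpg", plain);
        byte[] jpeg = plain.toByteArray();

        byte[] text = comment.getBytes(StandardCharsets.US_ASCII);
        int segmentLength = text.length + 2; // the length field counts its own two bytes

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(jpeg, 0, 2);               // SOI marker (FF D8)
        out.write(0xFF);
        out.write(0xFE);                     // COM marker
        out.write(segmentLength >> 8);
        out.write(segmentLength & 0xFF);
        out.write(text, 0, text.length);
        out.write(jpeg, 2, jpeg.length - 2); // rest of the original file
        return out.toByteArray();
    }

    // What the question's code effectively does: decode to pixels, then encode again.
    static byte[] reencode(byte[] original) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(original));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(decoded, "jpg", out);
        return out.toByteArray();
    }

    // Byte-preserving text search (ISO-8859-1 maps every byte to one char).
    static boolean contains(byte[] haystack, String needle) {
        return new String(haystack, StandardCharsets.ISO_8859_1).contains(needle);
    }

    public static void main(String[] args) throws IOException {
        byte[] uploaded = jpegWithComment("camera=demo");
        byte[] reencoded = reencode(uploaded);
        System.out.println("comment in uploaded bytes:   " + contains(uploaded, "camera=demo"));
        System.out.println("comment in re-encoded bytes: " + contains(reencoded, "camera=demo"));
    }
}
```

The uploaded bytes still carry the comment segment; the bytes produced by the read/write round trip are a freshly encoded JPEG in which that segment is gone.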

As you seem to already have the image data from the MultipartFile, you should simply store that data (the byte array) in the database, and you will store exactly what the user uploaded. No difference in file size, and metadata will be untouched.

Your code above doesn't compile for me, and I'm no HBase expert, so I'll leave the HBase part out (since you have already been able to store an image and see the size/quality difference and metadata loss, I assume you know how to do that :-) ). But here are the basics:

public ArrayList<Object> uploadImagesToHbase(MultipartFile uploadedFileRef) {
    byte[] bytes = uploadedFileRef.getBytes();

    // Store the above "bytes" byte array in HBase *as is* (no ImageIO)
}
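One way to check that nothing changed on the way in and out of the database is to compare a digest of the uploaded bytes with a digest of the retrieved bytes. The sketch below assumes an in-memory map as a stand-in for the HBase table; `RawImageStore`, `put`, `get`, and `sha256Hex` are hypothetical names for this illustration, not HBase API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an HBase table: row key -> raw cell value.
class RawImageStore {
    private final Map<String, byte[]> cells = new HashMap<>();

    // Store the uploaded bytes as is: no ImageIO, no re-encoding.
    void put(String rowKey, byte[] uploadedBytes) {
        cells.put(rowKey, uploadedBytes.clone());
    }

    // Return the stored bytes, or null if the row does not exist.
    byte[] get(String rowKey) {
        byte[] value = cells.get(rowKey);
        return value == null ? null : value.clone();
    }

    // Hex-encoded SHA-256 digest, used to compare what went in with what came out.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] uploaded = {(byte) 0xFF, (byte) 0xD8, 0x01, 0x02, (byte) 0xFF, (byte) 0xD9};
        RawImageStore store = new RawImageStore();
        store.put("1", uploaded);
        byte[] retrieved = store.get("1");
        System.out.println(sha256Hex(uploaded).equals(sha256Hex(retrieved)));
    }
}
```

With a real table, the same comparison applies to whatever byte array the client returns for the cell; if the two digests match, the file, including its metadata, is unchanged.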
Harald K

Please try using SerializationUtils from Apache Commons Lang.

Its methods are:

static Object clone(Serializable object)                            // Deep clones an Object using serialization.
static Object deserialize(byte[] objectData)                        // Deserializes a single Object from an array of bytes.
static Object deserialize(InputStream inputStream)                  // Deserializes an Object from the specified stream.
static byte[] serialize(Serializable obj)                           // Serializes an Object to a byte array for storage/serialization.
static void   serialize(Serializable obj, OutputStream outputStream) // Serializes an Object to the specified stream.

When storing into HBase, you can store the byte[] returned from serialize. When reading it back, deserialize the stored bytes and cast the result to the corresponding type (for example, a File object).
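For readers without Commons Lang on the classpath, the round trip described above can be sketched with plain `java.io` object streams, which is the mechanism `SerializationUtils` builds on. The class `SerializationRoundTrip` and its two methods are hypothetical helpers written for this illustration; they only mirror the idea of `serialize` and `deserialize`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical helper class; a plain-JDK sketch of serialize/deserialize.
class SerializationRoundTrip {

    // Writes the object with Java serialization and returns the resulting bytes.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Reads a single object back; the caller casts it to the expected type.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] imageBytes = {(byte) 0xFF, (byte) 0xD8, 0x10, 0x20, (byte) 0xFF, (byte) 0xD9};
        byte[] stored = serialize(imageBytes);
        byte[] restored = (byte[]) deserialize(stored);
        System.out.println("round trip equal: " + Arrays.equals(imageBytes, restored));
        System.out.println("raw length " + imageBytes.length + ", serialized length " + stored.length);
    }
}
```

Note that the serialized form carries a Java serialization header in addition to the data, so the stored cell is longer than the raw file and has to be deserialized before it can be used as an image.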

Whitefret
Ram Ghadiyaram
  • The OP already has the `byte[] bytes` from the `MultipartFile`. He does not need to (re-)serialize anything. – Harald K May 02 '16 at 13:23
  • @haraIdK: I know that his way of serializing and converting to a byte array is not working. I am asking him to try a different approach to verify whether it works. It seems you misunderstood my intent. – Ram Ghadiyaram May 02 '16 at 13:41