I'm trying to create a Java program that will OCR images in many formats. The images cannot be read directly from a file, because their bytes are sent over the network.
I'm currently able to read the raw bytes of the image pixels using ImageIO. However, I would like to support all the formats supported by ImageMagick, so I want to read the image using JMagick and then pass the raw bytes to Tess4J. I'm not sure how I should approach this. I found that this function can give me bytes:
PixelPacket[] MagickImage.getColormap();
But I would have to write a special method for transforming the obtained PixelPacket objects into consecutive bytes. I can do that, but maybe there's a better way? For example, maybe there's some extremely raw file format (even rawer than http://en.wikipedia.org/wiki/BMP_file_format#mediaviewer/File:BMPfileFormat.png) that I could use, for example in this method:
byte[] imageToBlob(ImageInfo imageInfo) ?
The imageInfo object would have to point to this raw format, and then I could cut the pixel information out of the bytes array.
Is this the proper way, or should I use something simpler (faster/more robust)?
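For reference, the ImageIO route mentioned above can be sketched roughly as follows. This is only a minimal illustration using the JDK, not my exact code: the class name RawPixels and the method toRgb are hypothetical, and it assumes the encoded image arrives as a byte array and that packed 8-bit RGB (3 bytes per pixel, row by row) is the desired output.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper: decodes an in-memory image with ImageIO
// and flattens it to packed RGB bytes.
final class RawPixels {

    /** Decodes an encoded image (PNG, JPEG, ...) and returns R,G,B bytes, row by row. */
    static byte[] toRgb(byte[] encoded) throws IOException {
        BufferedImage img = ImageIO.read(new ByteArrayInputStream(encoded));
        if (img == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("No ImageIO reader for this image format");
        }
        int w = img.getWidth();
        int h = img.getHeight();
        // getRGB converts every pixel to the default ARGB layout, whatever the source type.
        int[] argb = img.getRGB(0, 0, w, h, null, 0, w);
        byte[] rgb = new byte[w * h * 3];
        for (int i = 0; i < argb.length; i++) {
            rgb[3 * i]     = (byte) (argb[i] >> 16); // red
            rgb[3 * i + 1] = (byte) (argb[i] >> 8);  // green
            rgb[3 * i + 2] = (byte) argb[i];         // blue
        }
        return rgb;
    }
}
```

The limitation is that ImageIO.read only handles the formats for which a reader plugin is installed, which is why I would like to go through ImageMagick instead.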
Edit
I found that the format I had in mind is called PNM.
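Cutting the pixels out of a PNM blob could look something like the sketch below. It assumes the blob is a binary PPM ("P6", RGB) or binary PGM ("P5", grayscale) with a maximum sample value of at most 255, so each sample is one byte. The class Pnm and its fields are hypothetical names for illustration; the ASCII variants (P1 to P3), bitmaps (P4) and 16-bit samples are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical parser for binary PGM (P5) and PPM (P6) data held in memory.
final class Pnm {
    final int width;
    final int height;
    final int channels; // 1 for P5 (gray), 3 for P6 (RGB)
    final byte[] pixels; // width * height * channels bytes, row by row

    private Pnm(int width, int height, int channels, byte[] pixels) {
        this.width = width;
        this.height = height;
        this.channels = channels;
        this.pixels = pixels;
    }

    static Pnm parse(byte[] data) {
        int[] pos = {0};
        String magic = nextToken(data, pos);
        int channels;
        if ("P6".equals(magic)) {
            channels = 3;
        } else if ("P5".equals(magic)) {
            channels = 1;
        } else {
            throw new IllegalArgumentException("Unsupported PNM type: " + magic);
        }
        int width = Integer.parseInt(nextToken(data, pos));
        int height = Integer.parseInt(nextToken(data, pos));
        int maxVal = Integer.parseInt(nextToken(data, pos));
        if (maxVal < 1 || maxVal > 255) {
            throw new IllegalArgumentException("Only 8-bit samples are handled, maxval=" + maxVal);
        }
        // Exactly one whitespace byte separates the header from the raster.
        int start = pos[0] + 1;
        int length = width * height * channels;
        if (start + length > data.length) {
            throw new IllegalArgumentException("Truncated pixel data");
        }
        return new Pnm(width, height, channels, Arrays.copyOfRange(data, start, start + length));
    }

    // Reads the next whitespace-delimited header token, skipping '#' comments.
    private static String nextToken(byte[] d, int[] pos) {
        int i = pos[0];
        while (i < d.length) {
            if (d[i] == '#') {
                while (i < d.length && d[i] != '\n') i++; // comment runs to end of line
            } else if (isSpace(d[i])) {
                i++;
            } else {
                break;
            }
        }
        int begin = i;
        while (i < d.length && !isSpace(d[i])) i++;
        pos[0] = i; // left on the whitespace byte that ends the token
        return new String(d, begin, i - begin, StandardCharsets.US_ASCII);
    }

    private static boolean isSpace(byte b) {
        return b == ' ' || b == '\t' || b == '\n' || b == '\r' || b == 0x0B || b == '\f';
    }
}
```

The pixels array, together with width, height and channels * 8 as the bits per pixel, would then be what gets wrapped in a ByteBuffer for the OCR call, assuming the Tess4J version in use offers an overload taking those arguments.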