I am currently using this to extract floats:

TIFF* tiff = TIFFOpen(tiffs[i].c_str(), "r");
if (tiff) {
    uint32 width, height;
    tsize_t scanlength; 
    if (TIFFGetField(tiff,TIFFTAG_IMAGEWIDTH, &width) != 1) {}
    if (TIFFGetField(tiff,TIFFTAG_IMAGELENGTH, &height) != 1) {}
    fwidth = width;
    fheight = height;
    vector<float> data;
    scanlength = TIFFScanlineSize(tiff);
    float image[height][width];
    for (uint32 y = 0; y < height; y++) {
        TIFFReadScanline(tiff,image[y],y);
    }
}

This takes over 0.02 seconds per TIFF, and I need it to be much quicker. I know other libraries can sort of handle this, but I have found only one other that can read 32-bit TIFFs, CImg, and it took far longer. Even if the answer is as simple as using system() to run a command-line tool or call a really fast script, I would love to know if there is a faster way.

Thank you!

https://www.dropbox.com/s/5zb8spaz7cma1gx/pic.tif?dl=0

This is an example tif.

  • 20ms for image I/O doesn't sound crazy. What size are the images? Any compression in the TIFFs? – Peter Jun 14 '17 at 13:06
  • @Peter I have to convert a ton of them so it ends up being a lot :/ 512 by 512, and nope. –  Jun 14 '17 at 13:09
  • OK, so that sounds a little slow compared to disk I/O, but not much... Your 1MB file on a 100MB/s disk system would take 10ms just in raw disk read. First benchmark the file access (just read() the entire file into a buffer, being careful of the OS cache confounding your measurements), then see what overhead the library is adding. – Peter Jun 14 '17 at 13:16
  • @Peter how should I do that? sorry, i'm a total beginner :/ –  Jun 14 '17 at 19:09
  • Open and read the entire file using normal binary file reading functions. Grab the time before (using std::chrono) and the time after, subtract. Once it's working, easiest will be to reboot to be sure the filesystem cache is clean, then run the benchmark. This result will give you a lower bound on how fast you can do this processing. – Peter Jun 15 '17 at 13:52
  • @Peter it's taking less than 0.01 seconds, so the library is making it take twice as long. Do you know if it's possible to cut that extra time out? –  Jun 16 '17 at 17:18
  • What OS are you using? And what disk drive and what filesystem? – Mark Setchell Jun 18 '17 at 11:17
  • What are you actually trying to achieve overall - are you putting thousands of TIFFs together to make a movie or something? – Mark Setchell Jun 18 '17 at 11:29
  • @MarkSetchell RHEL 6/NFS. Nope, converting thousands of TIFFS to a viewable file format to display, they can't be viewed b/c they're 32 bit. –  Jun 19 '17 at 12:20
  • From the libtiff docs (http://www.libtiff.org/libtiff.html) it looks like the "strip" interface might be more efficient than the "scanline" interface, as it's oriented more directly to the storage format on disk. The provided examples don't output finished images... you'll have to do some work to convert those examples into your program. – Peter Jun 19 '17 at 13:36
  • Can you share a TIFF for me to test with? There is no need to write any code - you can just use ImageMagick. Make a new directory with a couple of TIFF files in (**copies NOT your originals**) and go in there and try `mogrify -format png *.tif` to convert them to PNGs. By the way, NFS is dog-slow. – Mark Setchell Jun 19 '17 at 13:39
  • Also, if you are doing thousands, you should definitely consider using **GNU Parallel** which is just a Perl script. `parallel --bar -X mogrify -format png ::: *tif` – Mark Setchell Jun 19 '17 at 14:33
  • How's it going? – Mark Setchell Jun 19 '17 at 20:29
  • @MarkSetchell thanks so much for all your responses, sorry for the late reply! The mogrify thing unfortunately didn't work, it created two images, one black, and one with the correct shape of the image, but the shape is completely white when it should have a gradient. I can't unfortunately upload a full image, but I'll work on getting another one to you. Thanks for your patience! –  Jun 20 '17 at 12:38
  • @MarkSetchell image is up. –  Jun 20 '17 at 13:34
  • Try my Tiff loader https://github.com/MalcolmMcLean/tiffloader. It might be faster than your current libraries. – Malcolm McLean Jun 20 '17 at 16:01
  • Does your code to read the file actually work? It looks like your `image` is declared with one float per pixel whereas you actually have 3? – Mark Setchell Jun 22 '17 at 15:24
  • @MarkSetchell it actually does work...wow, I didn't pick that up but I'll look into why! –  Jun 23 '17 at 12:58
  • @MalcolmMcLean thanks, I'll look into it! Should I just need to download and #include? –  Jun 23 '17 at 13:00
  • It's all in one C source file, with no dependencies other than the standard library. Just drop in and compile, and put a prototype for the function somewhere. – Malcolm McLean Jun 23 '17 at 13:02
  • @MalcolmMcLean I just added it to my directory (I'm using cmake) and included loadtiff.c and tried to compile and got a ton of errors, starting with `loadtiff.c:640: error: invalid conversion from ‘void*’ to ‘long unsigned int*’`. Does it require C++ 11 or something?? –  Jun 23 '17 at 13:08
  • You're compiling as C++. It needs to be compiled as C. (Make the header extern "C") – Malcolm McLean Jun 23 '17 at 13:09
  • @MalcolmMcLean i have the library compiled and linked but get the error that floadtiff can't be found? –  Jun 23 '17 at 14:24
  • You probably need to wrap the header in extern "C": `extern "C" { #include "loadtiff.h" }` – Malcolm McLean Jun 23 '17 at 14:55
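
The raw-read benchmark Peter describes in the comments above could look roughly like the following. This is only a sketch, assuming a C++11 compiler; `read_whole_file` and `time_raw_read` are hypothetical helper names, not part of libtiff or any other library. It reads the file with plain binary I/O and does no TIFF decoding, so the time it reports is a lower bound for any decoder reading the same file.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file into memory with plain binary I/O (no TIFF decoding).
std::vector<char> read_whole_file(const std::string& path) {
    // Open at the end so tellg() gives the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamsize size = in.tellg();
    in.seekg(0, std::ios::beg);
    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !in.read(buffer.data(), size)) {
        throw std::runtime_error("short read on " + path);
    }
    return buffer;
}

// Time one raw read of `path` in seconds; bytes_read reports the file size.
double time_raw_read(const std::string& path, std::size_t& bytes_read) {
    const auto start = std::chrono::steady_clock::now();
    const std::vector<char> buffer = read_whole_file(path);
    const auto stop = std::chrono::steady_clock::now();
    bytes_read = buffer.size();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_raw_read("pic.tif", n)` once per file and comparing the result with the time of the libtiff loop shows how much of the 0.02 seconds is decoding overhead. As noted in the comments, the OS file cache can make repeated runs look much faster than a cold read.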

1 Answer


Mmmm, not as complete an answer as I was hoping to be able to provide, but I do have some thoughts to share. Maybe they will trigger some further thoughts from me or other folks...


Firstly, your images are lacking TIFF tag 262 ("Photometric Interpretation") which is upsetting several tools you might otherwise use. What program generated the images - as they are not strictly compliant? Can you correct/improve the program that generated the images?

I managed to set the "Photometric Interpretation" tag to "min-is-black" with:

tiffset -s 262 0 YourImage.tif

Once that is set, I managed to use vips, which is exceedingly fast and memory-efficient, to convert your file to JPEG. It has Ruby and Python bindings if you prefer those languages.

So, the command-line in Terminal to convert your file to JPEG is:

vips im_vips2jpeg YourFile.tif result.jpg

[image: result.jpg as produced by vips]

I am not convinced that works correctly though, so maybe John @user894763 (the author of vips) would take a look.


Another thought, using vips, is that the following command will save a raw RGB file of 3 floats per pixel, which you can read straight into your own program without any decoding at all:

vips rawsave YourFile.tif image.raw

-rw-r--r--   1 mark  staff  3145728 20 Jun 16:59 image.raw

You'll note that the file size (3145728) corresponds to:

512 pixels * 512 pixels * 3 RGB values * 4 bytes of float each
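
Reading such a file back into a program might look like the sketch below. It assumes the raw file has no header, holds native-endian 32-bit floats, and stores the three values of each pixel next to each other; that layout is an assumption worth checking against the vips documentation. `read_raw_floats` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Load a headerless file of native-endian 32-bit floats holding
// width * height * bands values (assumed layout: R,G,B per pixel, row by row).
std::vector<float> read_raw_floats(const std::string& path, std::size_t width,
                                   std::size_t height, std::size_t bands) {
    const std::size_t count = width * height * bands;
    const std::size_t bytes = count * sizeof(float);
    std::vector<float> pixels(count);
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    // One bulk read straight into the float buffer; no decoding step.
    in.read(reinterpret_cast<char*>(pixels.data()),
            static_cast<std::streamsize>(bytes));
    if (static_cast<std::size_t>(in.gcount()) != bytes) {
        throw std::runtime_error("file is smaller than width*height*bands floats");
    }
    return pixels;
}
```

For the example image this would be called as `read_raw_floats("image.raw", 512, 512, 3)`, which reads 512 * 512 * 3 * 4 = 3145728 bytes. Under the assumed layout, band `b` of pixel `(x, y)` sits at index `(y * width + x) * bands + b`.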

I also used ImageMagick to convert your image to JPEG, with

convert YourImage.tif result.jpg

and got this result:

[image: result.jpg as produced by ImageMagick]


A further thought that occurred to me was that you could pre-warm your buffer cache before running your own TIFF extract program, by running cat on each of your files to cause them to be fetched from the NFS server:

cat *.tif > /dev/null

or maybe run parallel streams of that to reduce latency.


Another thought was that you could pre-fetch the files to a RAM-backed filesystem so that your files can be read with minimal latency. At 186kB per file, you could get 5,000 in a 1GB RAMdisk for much faster processing:

mkdir /tmp/RAM
sudo mount -t tmpfs -o size=1G tmpfs /tmp/RAM

You could also put intermediate files that I suggest in my thoughts above into the RAM filesystem.

Mark Setchell
  • Thanks, I'll look into trying these and timing them! Quick question, I started running my program on a local drive as opposed to NFS, and it took the same amount of time. Do you have any clue as to why? –  Jun 21 '17 at 18:28
  • It's extremely hard to guess without knowing your entire environment. – Mark Setchell Jun 21 '17 at 21:26
  • I believe this is related to your other question, so this answer may be superseded by the following... https://stackoverflow.com/a/44756292/2836621 – Mark Setchell Jun 26 '17 at 09:10