How can I create a bunch of images overlaying two images quickly?

Question

I've tried ImageMagick, GraphicsMagick, and CImg, but generating 10,000 images takes 10 minutes.

I need to do this in 3 minutes.

How can I make this task faster?

This is my code:

#define cimg_use_png
#define cimg_display 0
#include "CImg.h"
#include <iostream>
#include <vector>

using namespace cimg_library;

int main() {
    for (int i = 0; i < 10000; ++i) {
        CImg<unsigned char> gradient("gradient.png");
        CImg<unsigned char> overlay("overlay.png");
        gradient.draw_image(80, 150, overlay);
        gradient.save_png("result.png");
    }
}

How could I do this faster, using another library or another language? I have a 2 GHz processor and 4 GB of RAM.

Please help.

Try using multiple threads. One thread reads the images into a buffer. Thread 2 performs the overlays to a new buffer. Thread 3 writes the overlay images to the file. Adjust buffer sizes to optimize for speed. — Thomas Matthews, Jul 18 '18 at 00:28
Based on the docs (http://cimg.eu/CImg_reference.pdf), it appears that CImg can employ OpenMP, a parallel programming library. Might want to look into that. — Alex Johnson, Jul 18 '18 at 00:28
Save your gradient to MPC (memory mapped format) once. Then reading back for use in overlaying the same gradient will be much faster. See http://www.imagemagick.org/Usage/files/#mpc — fmw42, Jul 18 '18 at 00:52

score 2 · Accepted Answer · answered Jul 18 '18 at 12:47

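The slowest part is the PNG I/O: decoding both input files and encoding the result on every iteration. At 10 minutes for 10,000 images, each iteration costs about 60 ms, and the 3-minute target allows 18 ms. If we move the read operations outside of the for loop and clone the already-decoded image data with the copy constructor instead, we should be able to bring the run time under 3 minutes.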

#define cimg_use_png
#define cimg_display 0
#include "CImg.h"
#include <iostream>
#include <vector>

using namespace cimg_library;

int main() {
    // Decode both PNG files once, before the loop.
    CImg<unsigned char> parent_gradient("gradient.png");
    CImg<unsigned char> parent_overlay("overlay.png");
    for (int i = 0; i < 10000; ++i) {
        // Copy the decoded pixels in memory; no PNG decoding happens here.
        CImg<unsigned char> gradient(parent_gradient);
        CImg<unsigned char> overlay(parent_overlay);
        gradient.draw_image(80, 150, overlay);
        // Encoding the result is still paid on every iteration.
        gradient.save_png("result.png");
    }
}
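
To make the effect of hoisting the reads concrete, here is a minimal sketch that uses plain byte buffers rather than CImg. The `Image` struct and the `blit` and `compose` functions are hypothetical stand-ins (for a decoded single-channel image, for an opaque `draw_image`, and for one loop iteration). The only point is that each iteration copies already-decoded pixels and never touches a file or a decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a decoded single-channel image (not a CImg type).
struct Image {
    int width;
    int height;
    std::vector<unsigned char> pixels;  // row-major, width * height bytes
};

// Opaque paste of src onto dst with src's top-left corner at (x0, y0).
// Parts of src that fall outside dst are clipped.
void blit(Image& dst, const Image& src, int x0, int y0) {
    assert(dst.pixels.size() == static_cast<std::size_t>(dst.width) * dst.height);
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);
    for (int y = 0; y < src.height; ++y) {
        const int dy = y0 + y;
        if (dy < 0 || dy >= dst.height) continue;
        for (int x = 0; x < src.width; ++x) {
            const int dx = x0 + x;
            if (dx < 0 || dx >= dst.width) continue;
            dst.pixels[static_cast<std::size_t>(dy) * dst.width + dx] =
                src.pixels[static_cast<std::size_t>(y) * src.width + x];
        }
    }
}

// One loop iteration: copy the already-decoded base, then overlay.
// base and overlay are const references, so the decoded originals stay intact.
Image compose(const Image& base, const Image& overlay, int x0, int y0) {
    Image out = base;  // in-memory pixel copy only, no file access
    blit(out, overlay, x0, y0);
    return out;
}
```

In the CImg version, `compose(parent_gradient, parent_overlay, 80, 150)` corresponds roughly to the copy constructor followed by `draw_image(80,150,overlay)`; only the final `save_png` still goes through the PNG encoder.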

We can go further by using OpenMP to distribute the workload across the CPU threads. On my Mac, this drops the 10,000 iterations to under 20 seconds.

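// Build with OpenMP enabled (e.g. -fopenmp with GCC); otherwise the pragma is ignored.
#define cimg_use_png
#define cimg_display 0
#include "CImg.h"
#include <iostream>
#include <vector>

using namespace cimg_library;

int main() {
    // Decode both PNG files once; the threads only read from these.
    CImg<unsigned char> parent_gradient("gradient.png");
    CImg<unsigned char> parent_overlay("overlay.png");
    #pragma omp parallel for
    for (int i = 0; i < 10000; ++i) {
        // Each iteration works on its own private copies.
        CImg<unsigned char> gradient(parent_gradient);
        CImg<unsigned char> overlay(parent_overlay);
        gradient.draw_image(80, 150, overlay);
        // Note: every iteration writes the same "result.png"; real code
        // running in parallel needs a distinct filename per iteration.
        gradient.save_png("result.png");
    }
}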
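
One thing to watch in the parallel loop: every iteration saves to `result.png`, so threads running at the same time would be writing one file concurrently. Assuming the goal is 10,000 separate output files, a sketch like the following gives each iteration its own name. The helper `make_output_name` and the `result_00042.png` naming pattern are my own assumptions, not part of CImg or the original code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a zero-padded name such as "result_00042.png".
// It touches no shared state, so it is safe to call from several threads.
std::string make_output_name(int index) {
    char buffer[32];
    const int written =
        std::snprintf(buffer, sizeof buffer, "result_%05d.png", index);
    // The longest possible name (an 11-character int) fits well within 32 bytes.
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    (void)written;  // silences the unused-variable warning when NDEBUG is set
    return std::string(buffer);
}
```

Inside the loop, the save call would then become `gradient.save_png(make_output_name(i).c_str());`, so that no two iterations share an output file.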
