I want to convert a multi-page PDF to images using LEADTOOLS. To speed up the conversion, I load and save the pages in parallel, as follows:

string multiPagePDF = @"Manual.pdf";
string destFileName = @"output\Manual";
Task.Factory.StartNew(() =>
{
    using (RasterCodecs codecs = new RasterCodecs())
    {
        CodecsImageInfo info = codecs.GetInformation(multiPagePDF, true);
        ParallelOptions po = new ParallelOptions();
        po.MaxDegreeOfParallelism = 5;
        Parallel.For(1, info.TotalPages + 1, po, i =>
        {
            RasterImage image = codecs.Load(multiPagePDF, i);
            codecs.Save(image, destFileName + i + ".png", RasterImageFormat.Png, 0);
         });  
    }       
});

Is this approach thread-safe, or can it produce unexpected output? I ran it a few times, and in some runs a specific page appeared twice in the output images.

Solution

According to LEADTOOLS online chat support (which is very helpful, by the way), RasterCodecs.Load is NOT thread-safe, and the code above can produce unexpected output (in my case, page 1 occurred twice in the output set of images). The solution is to create the codecs variable inside the Parallel.For body so that each iteration uses its own RasterCodecs instance.

amyn

1 Answer

Amyn, as you found out, the correct way to use the RasterCodecs object in this case is this:

Task.Factory.StartNew(() =>
{
   using (RasterCodecs codecs = new RasterCodecs())
   {
      CodecsImageInfo info = codecs.GetInformation(multiPagePDF, true);
      ParallelOptions po = new ParallelOptions();
      po.MaxDegreeOfParallelism = 5;
      Parallel.For(1, info.TotalPages + 1, po, i =>
      {
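         // Each iteration creates its own RasterCodecs, so no load/save options are shared between threads.
         using (RasterCodecs codecs2 = new RasterCodecs())
         using (RasterImage image = codecs2.Load(multiPagePDF, i)) // dispose each page image once it is saved
         {
            codecs2.Save(image, destFileName + i + ".png", RasterImageFormat.Png, 0);
         }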
      });
   }
});

This gives you the same speed benefits when running on a multi-core processor without causing any conflicts between concurrent threads.
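
If creating one object per page turns out to be costly, a possible variation is to create one object per worker task instead, using the thread-local overload of Parallel.For from the .NET Task Parallel Library. The sketch below is only an illustration: Worker is a hypothetical stand-in class, not part of LEADTOOLS, and it assumes that reusing one instance for consecutive pages on the same task is acceptable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for an object that must not be shared between threads.
// It is NOT the LEADTOOLS API; it only returns the file name it would write.
public sealed class Worker : IDisposable
{
    public string Convert(int page) { return "Manual" + page + ".png"; }
    public void Dispose() { }
}

public static class ThreadLocalDemo
{
    // Converts pages 1..totalPages; each worker task owns exactly one Worker.
    public static ConcurrentBag<string> ConvertAll(int totalPages)
    {
        var outputs = new ConcurrentBag<string>();
        var po = new ParallelOptions { MaxDegreeOfParallelism = 5 };

        Parallel.For(1, totalPages + 1, po,
            () => new Worker(),              // localInit: runs once per worker task
            (page, loopState, worker) =>
            {
                outputs.Add(worker.Convert(page));
                return worker;               // hand the same instance to the next iteration
            },
            worker => worker.Dispose());     // localFinally: runs once per worker task

        return outputs;
    }
}
```

Calling ThreadLocalDemo.ConvertAll(12) yields twelve distinct names, one per page, in no particular order. No instance is ever used by two tasks at once, which is the property the per-iteration version above also guarantees.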

The LEADTOOLS RasterCodecs.Load() and RasterCodecs.Save() methods are thread-safe. The reason for creating multiple instances of the RasterCodecs class is that the class internally uses structures that hold many different loading and saving options for files. Changing the options in these structures from multiple threads can cause unpredictable results. One such value in the loading options structure is the page number. For this reason, we recommend using a separate instance of the class on each thread.
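
To make the page-number explanation concrete, here is a minimal model of the hazard. ModelCodecs is a hypothetical class written for this illustration; it is not the LEADTOOLS implementation and only assumes, as described above, that the requested page is stored in per-instance options before it is read. The interleaving is replayed on a single thread so the outcome is deterministic.

```csharp
using System;

// Hypothetical stand-in for a codec object; NOT the LEADTOOLS API.
// The requested page number lives in per-instance option state.
public sealed class ModelCodecs
{
    private int _pageNumber; // shared by every caller of this instance

    // Step 1 of a load: record which page was requested.
    public void SetPage(int page) { _pageNumber = page; }

    // Step 2 of a load: read whichever page the options currently name.
    public string ReadPage() { return "page " + _pageNumber; }
}

public static class SharedOptionsDemo
{
    // Replays one unlucky interleaving of two loads on a single instance.
    public static string[] SharedInstance()
    {
        var codecs = new ModelCodecs();
        codecs.SetPage(2);            // caller B asks for page 2
        codecs.SetPage(1);            // caller A asks for page 1 before B has read
        string b = codecs.ReadPage(); // B reads and gets page 1
        string a = codecs.ReadPage(); // A reads and gets page 1
        return new[] { a, b };
    }

    // The same sequence of calls with one instance per caller.
    public static string[] SeparateInstances()
    {
        var codecsA = new ModelCodecs();
        var codecsB = new ModelCodecs();
        codecsB.SetPage(2);
        codecsA.SetPage(1);
        return new[] { codecsA.ReadPage(), codecsB.ReadPage() };
    }
}
```

SharedInstance() returns "page 1" twice, which matches the duplicated page reported in the question, while SeparateInstances() returns "page 1" and "page 2".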

escape-llc
LEADTOOLS Support
  • But doesn't that mean that .Load() and .Save() are NOT thread-safe? If they were thread-safe, using the same RasterCodecs across multiple threads shouldn't have been a problem. Or maybe my definition of thread-safe is wrong? :/ – amyn Sep 22 '14 at 04:47
  • See the accepted answer to [this question](http://stackoverflow.com/questions/1999122). To claim thread safety, we must define the intended threading scenarios and the object's correct behavior in each one. Using the same codecs object to load different pages simultaneously can cause page mix-ups. It doesn't crash, but it doesn't work properly because that usage scenario is not a valid one. It's like deleting items from the same list on two different threads: even if each thread checks that the list is not empty, one thread will hit an error when the other deletes the last item after the first saw that one item remained. – LEADTOOLS Support Sep 25 '14 at 19:42
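
The list analogy in the last comment is a check-then-act race: an unguarded `if (items.Count > 0) items.RemoveAt(items.Count - 1);` can fail when another thread empties the list between the check and the removal. A minimal sketch of one common fix, using only standard .NET types (the class and method names are made up for this illustration), is to make the check and the removal a single step under a lock:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CheckThenActDemo
{
    // Removes the last item only if the list is non-empty. The check and the
    // removal happen under one lock, so no other thread holding the same lock
    // can empty the list in between them.
    public static bool TryRemoveLast(List<int> items, object gate)
    {
        lock (gate)
        {
            if (items.Count == 0) return false;
            items.RemoveAt(items.Count - 1);
            return true;
        }
    }

    // Two tasks compete for a single item; returns how many removals succeeded.
    public static int Run()
    {
        var items = new List<int> { 42 };
        var gate = new object();
        Task<bool> first = Task.Run(() => TryRemoveLast(items, gate));
        Task<bool> second = Task.Run(() => TryRemoveLast(items, gate));
        Task.WaitAll(first, second);
        return (first.Result ? 1 : 0) + (second.Result ? 1 : 0);
    }
}
```

CheckThenActDemo.Run() returns 1: whichever task takes the lock first removes the item, and the other sees an empty list and backs off instead of throwing. Giving each thread its own RasterCodecs achieves the same end differently, by removing the shared state rather than guarding it.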