I'm getting an out-of-memory exception when TessNet2 reads my bitmap. It happens specifically at this line: tessocr.GetThresholdedImage(bmp, System.Drawing.Rectangle.Empty).Save("c:\\temp\\" + Guid.NewGuid().ToString() + ".bmp");

This doesn't happen every time. It seems to happen only after I've run the program a few times in debug mode (I haven't tried packaging the code into an exe yet). This is a console application.

I've read about using bmp.UnlockBits(bmpData), but when I add that call I get a "Bitmap region is already locked" error when execution reaches the same tessocr.GetThresholdedImage(bmp, System.Drawing.Rectangle.Empty).Save("c:\\temp\\" + Guid.NewGuid().ToString() + ".bmp"); line.

for (int p = 0; p < pdfFiles.Count(); p++)
{
    images.Read(@"c:\temp\pdfs\" + pdfFiles[p].Name, settings);

    int pageNumber = 1;
    string pdfName = pdfFiles[p].Name;

    //__loop through each page of pdfFile
    foreach (MagickImage image in images)
    {                                   
        using (Bitmap bmp = image.ToBitmap())
        {                                                                    
            Console.WriteLine("PDF Filename: " + pdfName);
            Console.WriteLine("Page Number: " + pageNumber + " of " + images.Count);

            tessnet2.Tesseract tessocr = new tessnet2.Tesseract();
            //TODO change folder to startup Path
            tessocr.Init(@"C:\Users\Matt Taylor\Documents\Visual Studio 2012\Projects\TessNet2\TessNet2\bin\Debug\tessdata", "eng", false);

            tessocr.GetThresholdedImage(bmp, System.Drawing.Rectangle.Empty).Save("c:\\temp\\" + Guid.NewGuid().ToString() + ".bmp");
            //Tessdata directory must be in the same directory as this exe
            Console.WriteLine("Multithread version");

            ocr.DoOCRMultiThred(bmp, "eng");
            //Console.WriteLine("Normal version");
            //ocr.DoOCRNormal(bmp, "eng");    
            //bmp.UnlockBits(bmp);
            bmp.Dispose();

            pageNumber++;
        }
    } 
}

Once this error has occurred, if I run the code a few more times it eventually starts throwing the exception at the using (Bitmap bmp = image.ToBitmap()) line instead.

If I wait about 5 or 10 minutes, both of these errors go away.

  • How and where is `image` defined in your code.. ? – MethodMan Jul 26 '13 at 16:25
  • I made an edit to show you in the code – MaylorTaylor Jul 26 '13 at 16:32
  • Just an idea: DoOCRMultiThread probably starts a new thread and starts OCR-ing the bmp. While it does its magic, the bitmap will sit in memory, not getting released. It might be the case that you are really running out of memory. Do you get out of memory errors if you do the DoOCRNormal()? – ctrucza Jul 26 '13 at 16:49

1 Answer

As a first step, I would wrap the tessnet code in a using statement so the engine is disposed at the end of each iteration:

using(tessnet2.Tesseract tessocr = new tessnet2.Tesseract())
{
   tessocr.Init(...);
}

You also don't need to call bmp.Dispose() explicitly, since bmp is already declared in a using statement, which disposes it when the block exits.
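Putting both suggestions together, the per-page work might look roughly like the sketch below. It assumes, as the using suggestion above does, that tessnet2.Tesseract implements IDisposable. The class name, method names, and parameters (OcrSketch, ProcessPages, SaveThresholded, tessdataDir, outputDir) are hypothetical and only for illustration. Disposing the bitmap returned by GetThresholdedImage is an extra step not present in the question's code; it is included on the assumption that the returned Bitmap is a separate object that nothing else holds on to.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using ImageMagick;

static class OcrSketch
{
    // Hypothetical helper: writes the thresholded version of one page to disk
    // and releases the intermediate bitmap as soon as it has been saved.
    static void SaveThresholded(tessnet2.Tesseract tessocr, Bitmap page, string outputDir)
    {
        // Assumption: GetThresholdedImage returns a new Bitmap that the caller
        // is responsible for disposing (the question's code never disposes it).
        using (Bitmap thresholded = tessocr.GetThresholdedImage(page, Rectangle.Empty))
        {
            string fileName = Guid.NewGuid().ToString() + ".bmp";
            thresholded.Save(Path.Combine(outputDir, fileName));
        }
    }

    // Hypothetical replacement for the body of the question's foreach loop.
    static void ProcessPages(IEnumerable<MagickImage> pages, string tessdataDir, string outputDir)
    {
        foreach (MagickImage image in pages)
        {
            // Both objects are disposed when the block exits, even if an
            // exception is thrown, so no explicit bmp.Dispose() is needed.
            using (Bitmap bmp = image.ToBitmap())
            using (tessnet2.Tesseract tessocr = new tessnet2.Tesseract())
            {
                tessocr.Init(tessdataDir, "eng", false);
                SaveThresholded(tessocr, bmp, outputDir);
                // The OCR call from the question would go here, inside the
                // using block, while bmp is still alive.
            }
        }
    }
}
```

If the OCR call runs on another thread, as the comment on the question suggests, bmp must not be disposed until that work has finished, so the using block would need to wait for it to complete.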