6

I want to catch and ignore an ArrayIndexOutOfBoundsException (it's not something I have control over, so I need my program to keep chugging along).

However, my try/catch pair doesn't seem to catch the exception and ignore it. Hopefully you can pick out what I am doing wrong.

The exception occurs at this line

content = extractor.getTextFromPage(page);

Here is my code:

for(int page=1;page<=noPages;page++){
    try{
        System.out.println(page);           
        content = extractor.getTextFromPage(page);
        }
    }   
    catch (ArrayIndexOutOfBoundsException e){
    System.out.println("This page  can't be read");
    }    
}

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Invalid index: 02
    at com.lowagie.text.pdf.CMapAwareDocumentFont.decodeSingleCID(Unknown Source)
    at com.lowagie.text.pdf.CMapAwareDocumentFont.decode(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfContentStreamProcessor.decode(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfContentStreamProcessor.displayPdfString(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfContentStreamProcessor$ShowText.invoke(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfContentStreamProcessor.processContent(Unknown Source)
    at com.lowagie.text.pdf.parser.PdfTextExtractor.getTextFromPage(Unknown Source)
    at com.pdfextractor.main.Extractor.main(Extractor.java:64)

edit: I have put the try/catch within the for loop
and added the stack trace
and removed index=1

wattostudios
Ankur
  • By the way the line 'int index = 1' is completely useless in this case unless you have omitted code inside the try block for posting purposes – AmbrosiaDevelopments Nov 18 '09 at 05:17
  • Yes that is right, I have omitted the processing but forgot about index=1, will remove it now – Ankur Nov 18 '09 at 05:18
  • you only have try inside the loop, the catch statement is outside – cobbal Nov 18 '09 at 05:21
  • Duplication of http://stackoverflow.com/questions/1753615/reading-a-pdf-document-with-itext-not-working-sometimes – AmbrosiaDevelopments Nov 18 '09 at 05:22
  • Well that was about iText and trying to figure out what was causing the exception. This is about how do we ignore the exception given that it can't be solved. – Ankur Nov 18 '09 at 05:25
  • Amazingly, iText is throwing `if (offset + len > bytes.length) throw new ArrayIndexOutOfBoundsException("Invalid index: " + offset + len);`, but you still can't catch this exception. I tried `catch ( Exception e )` and it worked. – Rakesh Juyal Nov 18 '09 at 07:02

10 Answers

4

Is the ArrayIndexOutOfBoundsException that you put in the catch from the same package as the one being thrown? i.e. java.lang

Or perhaps catch Throwable to see if that even works.
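A minimal diagnostic sketch of that idea: catch Throwable and report the fully qualified class of whatever was thrown, so you can compare it with the type named in your catch clause. The `getTextFromPage` method here is a hypothetical stand-in for the iText call, not the real one.

```java
// Diagnostic sketch: report the exact class of whatever is thrown.
class CatchDiagnostics {

    /** Hypothetical stand-in for extractor.getTextFromPage(page); page 2 fails. */
    static String getTextFromPage(int page) {
        if (page == 2) {
            throw new ArrayIndexOutOfBoundsException("Invalid index: 02");
        }
        return "text of page " + page;
    }

    /** Returns the page text, or a description of what was thrown. */
    static String describe(int page) {
        try {
            return getTextFromPage(page);
        } catch (Throwable t) {
            // The fully qualified name shows which class was really thrown.
            return "threw " + t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        for (int page = 1; page <= 3; page++) {
            System.out.println(page + ": " + describe(page));
        }
    }
}
```

Once you know the real type, you can usually narrow the catch back down to it.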

digiarnie
  • That's actually a really good question, it wasn't the case, but then I imported "import java.lang.ArrayIndexOutOfBoundsException;" and I still get the error. I also have "import java.io.IOException;" I am wondering if that can cause a conflict? – Ankur Nov 18 '09 at 05:27
  • shouldn't. what happened when you try-catch (Throwable t)? – digiarnie Nov 18 '09 at 05:29
  • Sorry was in a meeting - just tried throwable and it seems to be the best solution. – Ankur Nov 18 '09 at 06:34
4

It is possible that the code you are calling is handling the ArrayIndexOutOfBoundsException and printing the stack trace on its own without rethrowing it. If that is the case, you would not see your System.out.println called.

EDIT: If you want to keep chugging along, it would be good to know that PdfContentStreamProcessor#processContent will catch the ArrayIndexOutOfBoundsException and then throw an instance of com.lowagie.text.ExceptionConverter, which is a subclass of RuntimeException.
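A rough sketch of what that means for the calling code, assuming the library really does wrap the original exception: a catch for ArrayIndexOutOfBoundsException alone would miss the wrapper, while a catch for RuntimeException covers both. `WrapperException` and `getTextFromPage` below are hypothetical stand-ins written for this example, not the iText classes.

```java
import java.util.ArrayList;
import java.util.List;

class WrappedExceptionDemo {

    /** Hypothetical stand-in for a library wrapper that extends RuntimeException. */
    static class WrapperException extends RuntimeException {
        WrapperException(Exception cause) {
            super(cause);
        }
    }

    /** Hypothetical stand-in for getTextFromPage: page 2 fails and is rethrown wrapped. */
    static String getTextFromPage(int page) {
        try {
            if (page == 2) {
                throw new ArrayIndexOutOfBoundsException("Invalid index: 02");
            }
            return "text of page " + page;
        } catch (ArrayIndexOutOfBoundsException e) {
            throw new WrapperException(e);
        }
    }

    /** Reads every page, recording a marker for pages that could not be read. */
    static List<String> readAll(int noPages) {
        List<String> results = new ArrayList<>();
        for (int page = 1; page <= noPages; page++) {
            try {
                results.add(getTextFromPage(page));
            } catch (ArrayIndexOutOfBoundsException e) {
                // Not reached in this sketch: the wrapper is thrown instead.
                results.add("unreadable: raw exception");
            } catch (RuntimeException e) {
                // Catches the wrapper (and any other unchecked exception).
                results.add("unreadable: " + e.getClass().getSimpleName());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        readAll(3).forEach(System.out::println);
    }
}
```

Catching RuntimeException is narrower than catching Throwable, so it is worth trying first if the wrapper theory holds.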

akf
  • Thanks I will have to do a little reading to get my head around throwing and catching but I suspect you are right. Will check it out and report back. – Ankur Nov 18 '09 at 05:30
  • Was just about to suggest the same thing. Seems like a high possibility (even if it would be a weird thing for the other component to do) – digiarnie Nov 18 '09 at 05:31
  • You could also try and (yuck) debug code and step into the other library (assuming you have either the source code or have a de-compiler in your IDE) and see exactly what it is doing. – digiarnie Nov 18 '09 at 05:33
  • Thanks akf - my java skills are unfortunately not good enough to know how to implement what you suggest in your EDIT – Ankur Nov 18 '09 at 06:35
3

Maybe this is a no-brainer (after all, I'm running on 3 hours of sleep in the last 36 hours), but along the lines of what digiarnie and Ankur mentioned: have you tried simply catch (Exception e)?

It's definitely not ideal, since obviously it (along with the Throwable t suggestion) will catch every exception under the sun, not just ArrayIndexOutOfBoundsException. Just thought I'd put the idea out there if you haven't tried it yet.
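One difference between the two broad catches, sketched below under made-up conditions: catch (Exception e) does not cover subclasses of Error, while catch (Throwable t) does. The `getTextFromPage` method is a hypothetical stand-in that throws an exception on page 2 and an Error on page 3.

```java
class BroadCatchDemo {

    /** Hypothetical page reader: page 2 throws an exception, page 3 throws an Error. */
    static String getTextFromPage(int page) {
        if (page == 2) {
            throw new ArrayIndexOutOfBoundsException("Invalid index: 02");
        }
        if (page == 3) {
            throw new Error("Multi-byte glyphs not implemented yet");
        }
        return "text of page " + page;
    }

    /** Shows which catch clause handles each kind of failure. */
    static String tryRead(int page) {
        try {
            return getTextFromPage(page);
        } catch (Exception e) {
            // Handles ArrayIndexOutOfBoundsException and every other Exception.
            return "skipped (Exception)";
        } catch (Error e) {
            // Only reached for Errors, which catch (Exception e) lets through.
            return "skipped (Error)";
        }
    }

    public static void main(String[] args) {
        for (int page = 1; page <= 3; page++) {
            System.out.println(page + ": " + tryRead(page));
        }
    }
}
```

Catching Error is usually discouraged, so treat the second clause as a last resort for a library you can't change.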

Magsol
1

Instead of using this exception, you should fix your code so that you do not go past array boundaries!

Most arrays count from 0 up to array.length-1

If you replace your for loop with this, you might find it avoids the entire issue:

for (int page = 0;page < noPages;page++){
Matt
  • 3
    The code causing the problem is not my own, and the page numbers correspond to real page numbers, so they cannot be arbitrarily changed. In short, I am not causing the exception, but I need to handle/ignore it – Ankur Nov 18 '09 at 05:13
  • But do you know what page number causes the exception? 99% of the time you should be using less than (<) instead of <= when looping through an array. Did you give that a try? – Matt Nov 18 '09 at 05:14
  • Yes, tried it. The page number causing the problem is page=31 and there are a total of 39 pages in the document. So in short, the error has nothing to do with this loop; it is coming out of the code behind the getTextFromPage() method. Since I only get it on some documents, and at different places in each document, it must have something to do with how that method works. It's part of the iText package. – Ankur Nov 18 '09 at 05:17
  • 1
    Yea, I see your stack trace you just added, and clearly it is coming from the PDF API you're using. Not sure exactly why it isn't being caught. You have an extra brace in your code but that should actually be causing a compile error. I guess, if desperate, you can try catching Throwable to see if you can trap *something* – Matt Nov 18 '09 at 05:22
0

You need the try/catch to be inside the for loop. Otherwise control pops out to the try/catch, the catch fires, and execution resumes after it, but the for loop has already been terminated.
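A small sketch of the difference, using a hypothetical `getTextFromPage` stand-in in which page 2 always fails. With the try around the whole loop, reading stops at the first bad page; with the try inside the loop, only the bad page is skipped.

```java
import java.util.ArrayList;
import java.util.List;

class LoopPlacementDemo {

    /** Hypothetical stand-in for extractor.getTextFromPage(page); page 2 is unreadable. */
    static String getTextFromPage(int page) {
        if (page == 2) {
            throw new ArrayIndexOutOfBoundsException("Invalid index: 02");
        }
        return "text of page " + page;
    }

    /** try wraps the whole loop: the first failure ends the loop. */
    static List<Integer> tryOutsideLoop(int noPages) {
        List<Integer> pagesRead = new ArrayList<>();
        try {
            for (int page = 1; page <= noPages; page++) {
                getTextFromPage(page);
                pagesRead.add(page);
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Stopped: a page can't be read");
        }
        return pagesRead;
    }

    /** try sits inside the loop: a failure skips one page and the loop continues. */
    static List<Integer> tryInsideLoop(int noPages) {
        List<Integer> pagesRead = new ArrayList<>();
        for (int page = 1; page <= noPages; page++) {
            try {
                getTextFromPage(page);
                pagesRead.add(page);
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("Page " + page + " can't be read");
            }
        }
        return pagesRead;
    }

    public static void main(String[] args) {
        System.out.println(tryOutsideLoop(4)); // the list holds only page 1
        System.out.println(tryInsideLoop(4));  // the list holds pages 1, 3 and 4
    }
}
```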

kolosy
0
    for(int page=1;page<=noPages;page++)
    {
        try
        {
            content = extractor.getTextFromPage(page); 
            System.out.println(content);
        }
        catch (ArrayIndexOutOfBoundsException e)
        {
            System.out.println("This page can't be read");
        }
    }
AmbrosiaDevelopments
  • 1
    page=0 causes another error - there is no page=0 the first page is page=1. The page variable corresponds to real page numbers. – Ankur Nov 18 '09 at 05:15
  • is there a count method you can call on the extractor object? Where do you get noPages from? – AmbrosiaDevelopments Nov 18 '09 at 05:15
  • Yes, there is other code that gets that value: int noPages = reader.getNumberOfPages(); that code works fine – Ankur Nov 18 '09 at 05:19
0

Perhaps this is a silly question... Are you sure that the exception is thrown in the code you posted and not in a different method?

TofuBeer
0

The program should have worked. You should give more details, including your class name. You can try catching Exception, or adding a finally block with a System.out.println in it.

fastcodejava
0

This is strange. I actually had a look at iText's source for the method the exception is thrown from (CMapAwareDocumentFont.decodeSingleCID) and it looks like this:

 private String decodeSingleCID(byte[] bytes, int offset, int len){
        if (toUnicodeCmap != null){
            if (offset + len > bytes.length)
                throw new ArrayIndexOutOfBoundsException("Invalid index: " + offset + len);
            return toUnicodeCmap.lookup(bytes, offset, len);
        }

        if (len == 1){
            return new String(cidbyte2uni, 0xff & bytes[offset], 1);
        }

        throw new Error("Multi-byte glyphs not implemented yet");
    }

The ArrayIndexOutOfBoundsException it throws is the standard Java one. I can't see any reason for your original try-catch not to work.

Perhaps you should post the entire class? Also, which version of iText are you using?

JMM
0

Wait a second! Your braces don't match up :) There is an extra closing brace after the try block, so your catch statement ends up outside your for statement! You have this:

for(int page=1;page<=noPages;page++){
    try{
        System.out.println(page);               
        content = extractor.getTextFromPage(page);
        }
    }   
    catch (ArrayIndexOutOfBoundsException e){
    System.out.println("This page  can't be read");
    }    
}

It should be:

for(int page=1;page<=noPages;page++) {
    try{
        System.out.println(page);               
        content = extractor.getTextFromPage(page);
    }
    catch (ArrayIndexOutOfBoundsException e){
        System.out.println("This page  can't be read");
    } 
} //end for loop  

}//This closes your method or whatever is enclosing the for loop
JMM