1

I have:

List<Optional<MyObject>> myList;

This list is populated by reading in from files. Once the file read is finished, I do a check against empty list: if(myList.size() == 0){//} and then proceed as follows:

myList.stream()
    .filter(Optional::isPresent)
    .map(i → i.orElse(new MyObject(“adventureBook”, 20))
    .collect(groupingBy(MyObject::getBookType, TreeMap::new, mapping(MyObject::getBookPrice, toList())));

I have around 350k MyObject files to read in from, 300k files are read in fine but when I try to read in the entire batch of c.350k files it throws a null pointer exception on the collect().

How is it possible that despite wrapping in Optional<> and checking Optional::isPresent, Optional::orElse etc still a null object manages to sneak through and given that I have such a large number of files what is the best way to try and narrow down on the errant file(s)? Thanks

edit: stack trace added

Exception in thread "main" java.lang.NullPointerException
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at com.mypackage.MyObject.main(MyObject.java:108)
shanlodh
  • 1,015
  • 2
  • 11
  • 30
  • 3
    Can you please post the stack trace? – xingbin Aug 25 '18 at 13:57
  • 4
    Having a list of optionals is already an odd thing, the non-existent objects shouldn't have ended up in the list in the first place. – Mark Rotteveel Aug 25 '18 at 14:02
  • and even if there could be one `MyObject` that is null, collecting those `20`(s) to a `List` is well rather an interesting choice :) what if there is a real book (`adventureBook `) with price `20`? How will u make a difference between the two? – Eugene Aug 25 '18 at 14:05
  • And `Optional` doesn't magically protect you against `NullPointerException` if you - for example - add `null` to that list instead of `Optional.empty()`. – Mark Rotteveel Aug 25 '18 at 14:05

2 Answers2

2

Well may be one of the properties in the non-null MyObject is null? You are indeed doing filter(Optional::isPresent), but that does not mean that the fields themselves are not null. MyObject::getBookType or MyObject::getBookPrice can easily be still null.

Eugene
  • 117,005
  • 15
  • 201
  • 306
  • Thanks to all who’ve responded. Taking the comments into consideration, I put in further null check before the file read arguments were passed to the MyObject constructor but problem still persists, the stack trace is too long for comments, I'm editing it to the OP. – shanlodh Aug 25 '18 at 14:26
  • @shanlodh what is on line 108 from `MyObject`? – Eugene Aug 25 '18 at 14:33
  • it's the .collect() mentioned in the OP – shanlodh Aug 25 '18 at 14:59
  • the 350k files I'd mentioned are distributed across c.15k folders and it appears that there were some non .gz2 files in a few folders that were upsetting the org.apache.commons.compress.compressors.* deserializers I was using. Using a helper class implementing FileNameFilter solved the issue. Thanks all once again – shanlodh Aug 25 '18 at 17:50
  • @shanlodh you do understand that this is literally unrelated with the question itself, right? good think you solved it, my next time provide all the details needed – Eugene Aug 25 '18 at 18:10
  • it shows how NPE can arise for various reasons and gives future readers of this thread an additional source of error to consider – shanlodh Aug 26 '18 at 06:12
-1

Regarding the question "what is the best way to try and narrow down on the errant file(s)?", I usually use a pattern like this when parsing:

for (File file : filesToParse) {
    try {
        parseFile(file);
    } catch (IOException e) {
        // rethrow the exception, and make sure that
        // 1. you give the file in the message, to help narrow the error down
        // 2. pass the original exception as 2nd parameter, to preserve the stack trace
        throw new IOException("Failed to parse file: " + file, e);
    }
}
user3151902
  • 3,154
  • 1
  • 19
  • 32
  • You should not catch all runtime exceptions, only the one you are interested in, like `IOException`. Catching everything should only be done as *super wrapper* around your `main` app, and only for logging. – Zabuzard Aug 25 '18 at 14:35
  • My understanding is that specific runtime exceptions should not be caught anyways, so I don't see a problem in throwing a different runtime exception, with more details to help debugging. See https://stackoverflow.com/questions/45607947 for example. `IOException` is not a runtime exception, so would need to be handled differently, if thrown. – user3151902 Aug 29 '18 at 10:23
  • @Zabuza What would be your suggested solution? – user3151902 Sep 12 '18 at 16:56
  • Only catch `IOException`. Let the rest through. And if you truly want to catch everything, you need to include the errors too. So `Throwable` instead of `RuntimeException`. Though its questionable whether one should piggyback a `RuntimeException` such as `IllegalStateException` on an error. – Zabuzard Sep 13 '18 at 11:52
  • 1
    @Zabuza Like this? – user3151902 Sep 24 '18 at 10:43
  • Yes, awesome :) – Zabuzard Sep 24 '18 at 10:53