
There are multiple questions about splitting streams, but I didn't find any that cover this use case in Java.

I have a huge stream of objects, Stream<A> (~1 million objects). The stream, streamA, comes from a file.

class A { enum Status { RUNNING, QUEUED, COMPLETED } Status status; String name; }

I want to split Stream<A> into three streams, one per status, without using any collect operation, because collecting loads everything into memory.

I am getting a StackOverflowError because I call Stream.concat many times in the code below.

Stream.concat has a known problem, mentioned in the Javadoc: "Implementation Note: Use caution when constructing streams from repeated concatenation. Accessing an element of a deeply concatenated stream can result in deep call chains, or even StackOverflowException."

    Map<Status, Stream<String>> splitStream = new HashMap<>();
    streamA.forEach(aObj -> {
        // Each element wraps the previous stream in another concat, so the nesting grows with every element.
        Stream<String> statusBasedStream = splitStream.getOrDefault(aObj.status, Stream.of());
        splitStream.put(aObj.status, Stream.concat(statusBasedStream, Stream.of(aObj.name)));
    });

There are a few custom stream implementations on GitHub that handle concatenation, but I want to solve this with the standard library.

If the data were smaller, I would have taken the list approach mentioned here (Split stream into substreams with N elements).

1 Answer


This is not an exact solution to the problem, but if you know the index ranges (that is, the elements are ordered by status and you know how many there are of each), a combination of Stream.skip() and Stream.limit() can help. Below is the dummy code that I tried:

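    import java.util.function.Supplier;
    import java.util.stream.Stream;

    int queuedCount = 100;
    int runningCount = 200;
    // A stream can be consumed only once, so use a supplier that re-opens the source for each split.
    Supplier<Stream<Object>> all = () -> Stream.of(); // placeholder source
    // Assumes the elements are ordered: queued first, then running, then completed.
    Stream<Object> queued = all.get().limit(queuedCount);
    Stream<Object> running = all.get().skip(queuedCount).limit(runningCount);
    Stream<Object> completed = all.get().skip(queuedCount + runningCount);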
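If the counts are not known in advance but the file can be read more than once, another option is to open a fresh stream per status and filter it lazily. The following is only a rough sketch under that assumption: the class name StatusSplit, the helper namesWithStatus, and the in-memory sample data are all made up for illustration, and the nested A and Status types are stand-ins for the ones in the question. It uses neither collect nor Stream.concat.

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

class StatusSplit {

    enum Status { RUNNING, QUEUED, COMPLETED }

    // Minimal stand-in for the question's class A.
    static final class A {
        final Status status;
        final String name;

        A(Status status, String name) {
            this.status = status;
            this.name = name;
        }
    }

    /**
     * Opens a fresh stream from the source and lazily keeps only the names
     * of the elements with the given status. Nothing is buffered.
     */
    static Stream<String> namesWithStatus(Supplier<Stream<A>> source, Status status) {
        return source.get()
                .filter(a -> a.status == status)
                .map(a -> a.name);
    }

    public static void main(String[] args) {
        // Hypothetical in-memory source. In practice the supplier would
        // re-open the file and parse each line into an A.
        Supplier<Stream<A>> source = () -> Stream.of(
                new A(Status.RUNNING, "job-1"),
                new A(Status.QUEUED, "job-2"),
                new A(Status.RUNNING, "job-3"),
                new A(Status.COMPLETED, "job-4"));

        // One lazy pass over the source per status.
        for (Status status : Status.values()) {
            namesWithStatus(source, status)
                    .forEach(name -> System.out.println(status + ": " + name));
        }
    }
}
```

The trade-off is that the file is read once per status (three times here) in exchange for constant memory use. If the supplier opens a file, for example with Files.lines, each returned stream should be closed after use, ideally with try-with-resources.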

Hope this is of some help.

Arvind Kumar