5

I'm toying with Java8's streams and CompletableFutures. My pre-existing code has a class that takes a single URL and downloads it:

public class FileDownloader implements Runnable {
    private URL target;
    public FileDownloader(String target) {
        this.target = new URL(target);
    }
    public void run() { /* do it */ }
}

Now, this class gets it's information from another part that emits List<String> (a number of targets on a single host).

I've switched the surrounding code to CompletableFuture:

public class Downloader {
    public static void main(String[] args) {
        List<String> hosts = fetchTargetHosts();
        for (String host : hosts) {
            HostDownloader worker = new HostDownloader(host);
            CompletableFuture<List<String>> future = 
                CompletableFuture.supplyAsync(worker);
            future.thenAcceptAsync((files) -> {
                for (String target : files) {
                    new FileDownloader(target).run();
                }
            });
        }
    }

    public static class HostDownloader implements Supplier<List<String>> {
        /* not shown */ 
    }
    /* My implementation should either be Runnable or Consumer.
       Please suggest based on a idiomatic approach to the main loop.
     */
    public static class FileDownloader implements Runnable, Consumer<String> { 
        private String target;
        public FileDownloader(String target) {
            this.target = target;
        }

        @Override
        public void run() { accept(this.target); }

        @Override
        public void accept(String target) {
            try (Writer output = new FileWriter("/tmp/blubb")) {
                output.write(new URL(target).getContent().toString());
            } catch (IOException e) { /* just for demo */ }
        }
    }
}

Now, this doesn't feel natural. I'm producing a stream of Strings and my FileDownloader consumes one of them at a time. Is there a readymade to enable my single value Consumer to work with Lists or am I stuck with the for loop here?

I know it's trivial to move the loop into the accept and just make a Consumer<List<String>>, that's not the point.

mabi
  • 5,279
  • 2
  • 43
  • 78

3 Answers3

4

There is no point in dissolving two directly dependent steps into two asynchronous steps. They are still dependent and if the separation has any effect, it won’t be a positive one.

You can simply use

List<String> hosts = fetchTargetHosts();
FileDownloader fileDownloader = new FileDownloader();
for(String host: hosts)
    CompletableFuture.runAsync(()->
        new HostDownloader(host).get().forEach(fileDownloader));

or, assuming that FileDownloader does not have mutable state regarding a download:

for(String host: hosts)
    CompletableFuture.runAsync(()->
        new HostDownloader(host).get().parallelStream().forEach(fileDownloader));

This still has the same level of concurrency as your original approach using supplyAsync plus thenAcceptAsync, simply because these two dependent steps can’t run concurrently anyway, so the simple solution is to put both steps into one concise operation that will be executed asynchronously.


However, at this point it’s worth noting that the entire use of CompletableFuture is not recommended for this operation. As it’s documentation states:

The problem with the common pool is that its pre-configured concurrency level depends on the number of CPU cores and won’t be adjusted if threads are blocked during an I/O operation. In other words, it is unsuitable for I/O operations.

Unlike Stream, CompletableFuture allows you to specify an Executor for the async operations, so you can configure your own Executor to be suitable for I/O operations, on the other hand, when you deal with an Executor anyway, there is no need for CompletableFuture at all, at least not for such a simple task:

List<String> hosts = fetchTargetHosts();

int concurrentHosts = 10;
int concurrentConnections = 100;
ExecutorService hostEs=Executors.newWorkStealingPool(concurrentHosts);
ExecutorService connEs=Executors.newWorkStealingPool(concurrentConnections);

FileDownloader fileDownloader = new FileDownloader();
for(String host: hosts) hostEs.execute(()-> {
    for(String target: new HostDownloader(host).get())
        connEs.execute(()->fileDownloader.accept(target));
});

At this place you may consider either, to inline the code of FileDownloader.accept into the lambda expression or to revert it to be a Runnable so that you can change the inner loop’s statement to connEs.execute(new FileDownloader(target)).

Holger
  • 285,553
  • 42
  • 434
  • 765
  • That sounds like you wouldn't advise using `CompletableFuture` with I/O operations at all? One of the simplifications for this questions was losing the `ExecutorService` I use in my real code (as well as reporting and error handling, of course), sorry, should have mentioned that. Thanks for the `WorkStealingPool`, I haven't seen that one used. – mabi May 20 '15 at 13:14
  • 1
    Well, I don’t recommend using it with I/O and its default executor. Therefore, it’s not a solution for getting rid of dealing with `ExecutorService` when performing I/O tasks. The only remaining strength is to allow to model complex dependencies (which are not present in this example). But I’m not sure whether I *want* to have such dependencies in my code and then modelling them by having tons of `CompletableFuture` instances uncontrollable linked together… – Holger May 20 '15 at 13:24
  • I like your reasoning. Since I'm after an idiomatic use of `Consumer` this seems like the most simple (and thus, in my book, the best) choice. Thanks! – mabi May 20 '15 at 13:49
1

I think you need to do forEach like that:

for (String host : hosts) {
    HostDownloader worker = new HostDownloader(host);
    CompletableFuture<List<String>> future = 
            CompletableFuture.supplyAsync(worker);
    future.thenAcceptAsync(files -> 
            files.stream()
            .forEach(target -> new FileDownloader(target).run())
    );
}

by the way, you could do the same with the main loop...

edit: Since OP edited original post, adding implementation details of FileDownloader, I am editing my answer accordingly. Java 8 functional interface is meant to allow the use of lambda expr in place of concrete Class. It is not meant to be implemented like regular interface 9although it can be) Therefor, "to take advantage of" Java 8 consumer means replacing FileDownloader with the code of accept like this:

for (String host : hosts) {
    HostDownloader worker = new HostDownloader(host);
    CompletableFuture<List<String>> future = CompletableFuture.supplyAsync(worker);
    future.thenAcceptAsync(files -> 
            files.forEach(target -> {
                try (Writer output = new FileWriter("/tmp/blubb")) {
                    output.write(new URL(target).getContent().toString());
                } catch (IOException e) { /* just for demo */ }
            })
    );
}
Sharon Ben Asher
  • 13,849
  • 5
  • 33
  • 47
  • 3
    Can also do `files.forEach(...)` directly on the list without getting a stream. – Misha May 20 '15 at 09:48
  • So the answer is "I'm stuck"? Can I somehow take advantage of the `Consumer` nature of my `FileDownloader` here? – mabi May 20 '15 at 10:17
  • 1
    @mabi: Your `FileDownloader` is not a `Consumer`. It’s a `Runnable`. To “take advantage of the `Consumer` nature” of your class, it has to have a `Consumer` nature first. – Holger May 20 '15 at 10:19
  • @Holger I've provided a bare bones "nature", thought that'd be implied by my comment. – mabi May 20 '15 at 10:36
  • 1
    @mabi: if your `FileDownloader` was a `Consumer` passing it directly to `forEach` was possible. Since it isn’t, you can’t do it, so the answers provide you with the best that is possible so far. You can’t blame the answers for not using what you didn’t provide in the question. – Holger May 20 '15 at 10:43
1

An alternative would be:

CompletableFuture.supplyAsync(worker)
                 .thenApply(list -> list.stream().map(FileDownloader::new))
                 .thenAccept(s -> s.forEach(FileDownloader::run));
assylias
  • 321,522
  • 82
  • 660
  • 783
  • 1
    It doesn’t make sense to split these steps into asynchronous operations. `list.stream().map(…)` does *nothing*; keep in mind the *lazy* nature of stream operations. The expression specified in `map` will be evaluated within the `forEach` operation. – Holger May 20 '15 at 10:10