
I would like to combine two observables: one checks whether the user is authorized to fetch data, the other actually fetches the data. I want to run the authorization in parallel with the data fetch to reduce the total latency. Here is an example that is not parallel:

public static void main(String[] args) {
    long startTime = currentTimeMillis();

    Observable<Integer> result = isAuthorized().flatMap(isAuthorized -> {
        if (isAuthorized) return data();
        else throw new RuntimeException("unauthorized");
    });

    List<Integer> data = result.toList().toBlocking().single();
    System.out.println("took: " + (currentTimeMillis() - startTime) + "ms");
    System.out.println(data);
    assert data.size() == 10;
}

private static Observable<Boolean> isAuthorized() {
    return Observable.create( s -> {
        try { sleep(5000); } catch (Exception e) {} // simulate latency
        s.onNext(true);
        s.onCompleted();
    });
}

private static Observable<Integer> data() {
    return Observable.create(s -> {
        for (int i = 0; i < 10; i++) {
            try { sleep(1000); } catch (Exception e) {} // simulate long running job
            s.onNext(i);
        }
        s.onCompleted();
    });
}

The total time to execute this is 15 seconds; if the authorization and data-fetching calls ran in parallel, it should be 10 seconds. How can I do that? Ideally I would also like to control how many data items at most are cached in memory while waiting for authorization to complete.

BTW, I've read an excellent answer about parallelizing observables, but I still don't know how to solve my problem.

Wojciech Gdela

2 Answers


To do this with type safety, I'd suggest an immutable wrapper class for both the isAuthorized() and data() emissions. Merge the two streams, then reduce and filter to emit either nothing (not authorized) or the data (authorized).

static class AuthorizedData {

    final Boolean isAuthorized; //null equals unknown
    final Data data; //null equals unknown

    // Constructor name must match the class name, AuthorizedData
    AuthorizedData(Boolean isAuthorized, Data data) {
        this.isAuthorized = isAuthorized;
        this.data = data;
    }

}

Observable<Data> authorizedData =
  isAuthorized()
    .map(x -> new AuthorizedData(x, null))
    .subscribeOn(Schedulers.io())
    .mergeWith(
        data().map(x -> new AuthorizedData(null, x))
              .subscribeOn(Schedulers.io()))
    .takeUntil(a -> a.isAuthorized!=null && !a.isAuthorized)
    .reduce(new AuthorizedData(null, null), (a, b) -> {
        if (a.isAuthorized!=null && a.data != null)
           return a;
        else if (b.isAuthorized!=null)
           return new AuthorizedData(b.isAuthorized, a.data);
        else if (b.data!=null)
           return new AuthorizedData(a.isAuthorized, b.data);
        else 
           return a;
    })
    .filter(a -> a.isAuthorized!=null 
                 && a.isAuthorized && a.data!=null)
    .map(a -> a.data);

authorizedData above is empty if the user is not authorized; otherwise it is a stream of a single Data item.

The point of takeUntil above is to unsubscribe from data() as soon as it is discovered that the user is not authorized. This is particularly useful if data() is interruptible (it can close a socket or whatever).

Dave Moten
  • updated with takeUntil addition to unsubscribe from data() if unauthorized – Dave Moten Jul 30 '15 at 08:33
  • This solution works, but only if `data()` returns a single item. In my case it returns many items, and I have to secure all of them with one `isAuthorized()` call, which returns a single boolean saying whether the user can see all data items or not. – Wojciech Gdela Jul 30 '15 at 14:17

I've found a way to do it:

  1. Subscribe to both observables on Schedulers.io() to run them in parallel.
  2. Repeat the isAuthorized() emission indefinitely using repeat(), because it normally emits only a single boolean.
  3. To avoid calling the authorization service for every item, use cache().
  4. Zip the stream of repeated booleans with the stream of items, and use the boolean to decide whether to return the item or throw an exception.

Here's the solution:

public static void main(String[] args) {
  long startTime = currentTimeMillis();

  Observable<Integer> result = Observable.defer(() -> {
    Observable<Boolean> p1 = isAuthorized().cache().repeat().subscribeOn(Schedulers.io());
    Observable<Integer> p2 = data().subscribeOn(Schedulers.io());
    return Observable.zip(p1, p2, (isAuthorized, item) -> {
      if (isAuthorized)
        return item;
      else
        throw new RuntimeException("unauthorized");
    });
  });

  List<Integer> data = result.toList().toBlocking().single();
  System.out.println("took: " + (currentTimeMillis() - startTime) + "ms");
  System.out.println(data);
  assert data.size() == 10;
}

private static Observable<Boolean> isAuthorized() {
  return Observable.create( s -> {
    try { sleep(5000); } catch (Exception e) {} // simulate latency
    s.onNext(true);
    s.onCompleted();
  });
}

private static Observable<Integer> data() {
  return Observable.range(1, 10)
      .doOnNext(i -> { try { sleep(1000); } catch (Exception e) {} });
}

From what I've observed, this solution also avoids OutOfMemory errors. Both observables begin to emit at the same time, but if the authorization service is slower, the data items are collected until the internal buffer is full. RxJava then stops requesting data items until the authorization finally emits the boolean.

It also unsubscribes correctly from the data stream when the authorization returns a negative outcome.
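The buffering and cancellation behaviour described above can be pictured without RxJava. What follows is a minimal plain-JDK sketch under stated assumptions, not RxJava's implementation: `BoundedGate`, `fetch`, the `capacity` parameter, and the `END` sentinel are hypothetical names introduced here. A bounded queue stands in for the internal buffer, and a blocking `put` stands in for "stop requesting items".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedGate {
    private static final int END = -1; // sentinel marking the end of the data stream

    // Hypothetical helper: produces items in parallel with the authorization
    // check and holds at most `capacity` of them while waiting for the decision.
    static List<Integer> fetch(CompletableFuture<Boolean> authorized, int items, int capacity) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> producer = pool.submit(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    buffer.put(i); // blocks while the buffer is full
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // cancelled: stop producing
            }
        });
        try {
            if (!authorized.join()) { // wait for the single boolean
                throw new SecurityException("unauthorized");
            }
            List<Integer> result = new ArrayList<>();
            for (int item = buffer.take(); item != END; item = buffer.take()) {
                result.add(item);
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        } finally {
            producer.cancel(true); // interrupts the producer if it is still running
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Authorization resolves after 200 ms; meanwhile the producer fills
        // the 3-slot buffer and then blocks.
        CompletableFuture<Boolean> auth = CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(200); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return true;
        });
        System.out.println(fetch(auth, 10, 3)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

With a negative authorization result, `fetch` throws before draining anything, and the `finally` block interrupts the blocked producer, which roughly mirrors the unsubscribe from the data stream.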

Wojciech Gdela
  • Cool idea. What about handling the unauthorized exception, though? Are you happy to receive onError for that? If you use a custom exception type and onErrorResumeNext, you could suppress that too. – Dave Moten Jul 30 '15 at 20:22
  • We actually use exceptions to control the flow: at the servlet level we subscribe to the secured observable, and if it errors with a custom authorization exception we return HTTP 403 Forbidden; otherwise we serialize each data item to the servlet output stream. – Wojciech Gdela Jul 30 '15 at 21:53
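The exception-driven flow described in the last comment could look roughly like the sketch below. It is plain Java with no servlet or RxJava dependency, and every name in it is hypothetical (`AuthorizationException`, `StatusMapper`, `respond`). A `Supplier` stands in for the secured observable and a `StringBuilder` stands in for the servlet output stream.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical custom exception signalling a failed authorization check.
class AuthorizationException extends RuntimeException {
    AuthorizationException(String message) {
        super(message);
    }
}

class StatusMapper {
    // Writes each item to `out` and returns 200, or returns 403 if the
    // secured source fails with AuthorizationException. Any other exception
    // propagates to the caller unchanged.
    static int respond(Supplier<List<Integer>> securedSource, StringBuilder out) {
        try {
            for (Integer item : securedSource.get()) {
                out.append(item).append('\n'); // stand-in for serializing one item
            }
            return 200;
        } catch (AuthorizationException e) {
            return 403; // HTTP 403 Forbidden
        }
    }
}
```

Because the zipped stream emits nothing until the boolean arrives, no item is written before the authorization outcome is known, so returning 403 never follows a partial response in this sketch.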