2

Currently I'm developing a merge between two sorted collections of elements of type T (the type is not important as long you provide a means to compare the Type, for example, in Java, A Comparator<T> will do the work).

What I don't want is to necessarily merge both data structures involved in the merge process (I don't want to get an entire new structure holding both elements merged). What I want is to have some kind of observer of the merge process in order to define what to do with each merged element in another class. For example, a would like to have something like this:

merger.merge(leftCollection,rightCollection,theComparator,theObserver).

Where the observer is a object watching the merge algorithm and gets notified of the actions, i mean :

interface MergeObserver<T> {

            /**
             * Triggered when the merge algorithm decides to merge only the left entry.
             * This case correspond to the case when there is no equivalent entry on the right collection.
             */
            public void mergeLeft(T entry);

            /**
             * Triggered when the merge algorithm decides to merge both entries.
             * This case correspond to the case when there exists the same entry on both collections.
             */
            public void mergeBoth(T left, T right);

            /**
             * Triggered when the merge algorithm decides to merge only the right entry.
             * This case correspond to the case when there is no equivalent entry on the left collection.
             */
            public void mergeRight(T entry);
        }

I have already make my implementation for sorted collections, but... I would like to share this feeling, and here comes the question, about if someone has thought of this before, specially in Guava Libraries, and what are the proper terminology employed.

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Victor
  • 3,841
  • 2
  • 37
  • 63
  • I suspect that this is too special purpose for a general purpose library, because the MergeObserver abstraction is only useful when merging (or otherwise joining) two datasets, which in enterprise computing is commonly done by the database, and rarely done otherwise. – meriton Sep 06 '15 at 12:01
  • Are you using the comparator to always compare a value from `left` with a value from `right`? If this is the case, wrap the given comparator with your own comparator, which simply delegates the comparison and then notifies the observers depending on the result of the comparison (`<0` for `mergeLeft`, `==0` for `mergeBoth` and `>0` for `mergeRight`). – fps Sep 06 '15 at 13:45

2 Answers2

2

The two most commonly used patterns for separating the traversal of a data structure and the processing of the data are the visitor pattern and the iterator pattern. Both of those patterns can be applied not only to real data structures that are present in memory but also to "virtual" data structures (which is probably not the proper term). e.g. the method List.subList in the Java API creates a view of a part of the list. So the List object returned by it is just a reference to part of another lists data. Of course you can also combine data structures. You could for example have a method that takes as arguments two iterators and returns a new iterator that merges the two without using any additional memory because that merged list is not actually present in RAM. If you used Scala instead of Java you would have lots of methods available that can transform iterators in many different ways to achieve effects like this.

import java.util.function.Predicate;
import java.util.function.BiPredicate;
import java.util.function.Supplier;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.NoSuchElementException;

interface MyIterator<T> extends Iterator<T> {
  class Peekable<T> {
    private final MyIterator<T> iter;
    private T next = null;
    private boolean isNextBuffered = false;
    private boolean atEnd = false;

    private Peekable(MyIterator<T> iter) {
      this.iter = iter;
    }

    private void advance() {
      if(atEnd) throw new NoSuchElementException();
      if(iter.hasNext()) {
        next = iter.next();
        isNextBuffered = true;
      } else {
        atEnd = true;
      }
    }
    private boolean hasNext() {
      if(atEnd) return false;
      if(!isNextBuffered) advance();
      return !atEnd;
    }
    private T next() {
      T next = peek();
      advance();
      return next;
    }
    private T peek() {
      if(hasNext()) return next;
      throw new NoSuchElementException();
    }
  }

  static <T> MyIterator<T> of(BooleanSupplier hasNext, Supplier<T> next) {
    return new MyIterator<T>() {
      public boolean hasNext() {
        return hasNext.getAsBoolean();
      }
      public T next() {
        return next.get();
      }
    };
  }

  static <T> MyIterator<T> of(Iterator<T> iter) {
    return of(iter::hasNext, iter::next);
  }

  static MyIterator<Integer> range(int start, int end) {
    int[] value = {start};
    return of(() -> value[0] < end, () -> value[0]++);
  }

  default <R> MyIterator<R> map(Function<? super T,? extends R> mapper) {
    return of(this::hasNext, () -> mapper.apply(this.next()));
  }

  default MyIterator<T> filter(Predicate<? super T> predicate) {
    Peekable<T> iter = new Peekable<T>(this);

    return new MyIterator<T>() {
      public boolean hasNext() {
        while(iter.hasNext() && !predicate.test(iter.peek())) iter.advance();
        return iter.hasNext();
      }
      public T next() {
        hasNext();
        return iter.next();
      }
    };
  }

  default MyIterator<T> merge(MyIterator<T> other, BiPredicate<? super T,? super T> smallerEqual) {
    Peekable<T> iter1 = new Peekable<T>(this);
    Peekable<T> iter2 = new Peekable<T>(other);

    return of(() -> iter1.hasNext() || iter2.hasNext(),
              () -> {
                if(!iter1.hasNext()) return iter2.next();
                else if(!iter2.hasNext()) return iter1.next();
                else {
                  T elem1 = iter1.peek();
                  T elem2 = iter2.peek();
                  return smallerEqual.test(elem1, elem2) ? iter1.next() : iter2.next();
                }
              });
  }
}

interface MyIterable<T> extends Iterable<T> {
  default Iterator<T> iterator() {
    return myIterator();
  }

  MyIterator<T> myIterator();

  static <T> MyIterable<T> of(Supplier<MyIterator<T>> myIterator) {
    return new MyIterable<T>() {
      public MyIterator<T> myIterator() {
        return myIterator.get();
      }
    };
  }

  static <T> MyIterable<T> of(Iterable<T> iterable) {
    return of(() -> MyIterator.of(iterable.iterator()));
  }

  static MyIterable<Integer> range(int start, int end) {
    return of(() -> MyIterator.range(start, end));
  }

  default <R> MyIterable<R> map(Function<? super T,? extends R> mapper) {
    return of(() -> this.myIterator().map(mapper));
  }

  default MyIterable<T> filter(Predicate<? super T> predicate) {
    return of(() -> this.myIterator().filter(predicate));
  }

  default MyIterable<T> merge(MyIterable<T> other, BiPredicate<? super T,? super T> smallerEqual) {
    return of(() -> this.myIterator().merge(other.myIterator(), smallerEqual));
  }
}


public class Test {
  public static void main(String[] args) {
    MyIterable<Integer> iterable = MyIterable.range(0, 10);

    MyIterable<Integer> iter1 = iterable.map(x -> 2 * x).filter(x -> x < 10);
    MyIterable<Integer> iter2 = iterable.map(x -> 2 * x + 1).filter(x -> x < 10);
    MyIterable<Integer> iterMerged = iter1.merge(iter2, (x, y) -> x <= y);

    iter1.forEach(System.out::println);
    System.out.println();
    iter2.forEach(System.out::println);
    System.out.println();
    iterMerged.forEach(System.out::println);
  }
}
Tesseract
  • 8,049
  • 2
  • 20
  • 37
  • very interesting words, thanks. Yup, the observer can be viewed as a kind of iterator with an inversed control flow, i mean, with a callback (instead of a direct call like next(), so you get called each time the algorithm defines who will be the next element); so this leads perhaps to the idea that the concrete types of observers can be viewed more like Visitors i guess. The particular case of the merge algorithm is the case where both elements are equals. It would be a very interesting excursive to invert the control flow.... Scala? I'm already married with JAVA ;P. Thanks spider – Victor Sep 06 '15 at 00:26
  • By using the new features introduced in Java 8 you can do much of the stuff that Scala has in Java as well. I added an example that shows how you can define your own iterables that can be merged, filtered, etc. without having to allocate much memory since no actual data structures are created. Look at the example at the bottom of the code first. It creates two iterables, one containing the numbers (0, 2, 4, 6, 8) and one with (1, 3, 5, 7, 9) and then both are merged into one iterable. – Tesseract Sep 06 '15 at 11:39
  • whoa....that is nice example.. i lack of some language elements compression introduced with JAVA 8 in order to fully understand the code, so i will take some until i processed it... still i get the idea of the application program at the bottom, i found very interesting (a good reason to learn java 8).. still, i am very interesting in having control when two equals elements ,according to a comparator, are found in both collections, because in some cases i need to proceed both, and in other cases only one. – Victor Sep 07 '15 at 04:22
1

What would probably be more idiomatically "java" is to write your merger with a listener:

public interface Merger {
    public Collection<T> merge(Collection<T> left, Collection<T> right, Comparator comparator);
    public void addListener(Observer observer);
    public void notifyListener(Message message);
}

public interface Observer {
    public void notify(Message message);
}
dwoz
  • 156
  • 6
  • Mmmm.. good consideration @dwoz. Thanks for the contribution... so you agree with the terminology, you only propose a way to add many observers... is fair, not necessary on my particular case. You know, i can archive that although i can say that the actual implementation can be change to `merger.merge(Collection left, Collection right, Comparator super T> theComparator,Collection> theObsersers)`. – Victor Sep 05 '15 at 16:38
  • @dwoz: Why would this be more idiomatic? – meriton Sep 06 '15 at 11:53
  • @meriton, my main reason for saying it is that the listener/observer pattern is sort of turned upside down in the OP's design. In his design the observed object has to have awareness of the internals of the observer, and that goes against the idea of what an observer pattern is about. Probably it's more apt to say that his implementation is an "object oriented anti-pattern" rather than to speak to its "java-ness." – dwoz Sep 06 '15 at 17:04