
With a Set in Ceylon it is straightforward to determine if one collection is a superset of the other. It's just first.superset(second). What's the best way to do the equivalent for an Iterable, List, or Sequential using multiset (or bag) semantics? For example something like the pseudocode below:

{'a', 'b', 'b', 'c'}.containsAll({'b', 'a'}) // Should be true
{'a', 'b', 'b', 'c'}.containsAll({'a', 'a'}) // Should be false
drhagen

3 Answers


There is Category.containsEvery, which is inherited by Iterable. It checks for each element of the parameter whether it is contained in the receiver, so that bigger.containsEvery(smaller) is equivalent to this:

smaller.every(bigger.contains)

(Note that the receiver and the argument are swapped around.) The expression in the parentheses here is a method reference; we could also write it out as a lambda:

smaller.every(o => bigger.contains(o))

So in your example:

print({'a', 'b', 'b'}.containsEvery({'b', 'a'})); // Should be true
print({'a', 'b', 'b'}.containsEvery({'a', 'a'})); // Should be false

... actually, those both return true. Why do you think the latter one is false?

Did you think of multiset semantics (i.e. each element must occur in the "superset" iterable at least as many times as it does in the smaller one)? Or do you want a sublist? Or do you just want to know whether the first iterable starts with the second (startsWith)?

I don't know about any multiset implementation for Ceylon (I found a multimap, though). If you are running on the JVM, you can use any Java one, like from Guava (though that also doesn't have a "contains all with multiples" function, as far as I can see). For small iterables, you can use .frequencies() and then compare the numbers:

Boolean isSuperMultiset<Element>({Element*} bigger,
                                 {Element*} smaller) =>
     let (bigFreq = bigger.frequencies())
        every({ for(key->count in smaller.frequencies())
                count <= (bigFreq[key] else 0) });

For sublist semantics, the SearchableList interface has the includes method, which checks whether another list is a sublist. (Not many classes implement it, though, so you would need to convert your first iterable into an Array, assuming it is not already a String or StringBuilder.)
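As a minimal sketch of that sublist check, assuming the elements are first copied into an Array (the function name run and the sample values are only for illustration):

```ceylon
shared void run() {
    // Copy the stream into an Array, which satisfies SearchableList.
    value letters = Array({'a', 'b', 'b', 'c'});

    // includes looks for the argument as a contiguous run of elements.
    print(letters.includes(['b', 'b'])); // 'b', 'b' occurs at index 1
    print(letters.includes(['a', 'c'])); // 'a' and 'c' are not adjacent
}
```

Note that this is stricter than multiset semantics: the elements have to appear next to each other and in the same order.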

For startsWith semantics, you could convert both to lists and then use List.startsWith. There should be a more efficient way of doing that (you could just go through both iterators in parallel).
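Going through both streams in parallel could look roughly like the sketch below. The name startsWithStream is made up for this example, and the constraint Element satisfies Object is an assumption that rules out null elements so that == can be used directly:

```ceylon
Boolean startsWithStream<Element>({Element*} longer, {Element*} shorter)
        given Element satisfies Object {
    value longerIterator = longer.iterator();
    for (expected in shorter) {
        // Keep going only if `longer` still has an element and it matches.
        if (!is Finished actual = longerIterator.next(), actual == expected) {
            continue;
        }
        // Either `longer` ran out first or the elements differ.
        return false;
    }
    // Every element of `shorter` was matched in order.
    return true;
}
```

This consumes each stream at most once and never asks for .size, so it should also be usable with streams whose length is expensive to compute.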

There is corresponding, but it stops as soon as the shorter one ends (i.e. it answers the question "does either of these two iterables start with the other?" without telling you which one is the longer one). The same goes for a number of other pair-related functions in ceylon.language.

If you know the length of both of the Iterables (or are confident that .size is fast), the following should solve the issue:

Boolean startsWith<Element>({Element*} longer, {Element*} shorter) =>
     shorter.size <= longer.size &&
     corresponding(longer, shorter);
Paŭlo Ebermann
  • "Did you think of multiset semantics..." Yes, this is what I was going for. I updated the examples to be clearer. – drhagen Jan 25 '18 at 21:41
  • 1
    `frequencies()` discards nulls. If that's not what you want, you should map each element in both streams to a `Singleton` sequence before calling `frequencies()`. – gdejohn Jan 26 '18 at 04:09

If you have two Sequentials, then you can remove each right-hand element, one at a time, from the left-hand sequence until you have either removed them all or failed to remove one of them.

Boolean containsAll<Element>([Element*] collection, [Element*] other)
        given Element satisfies Object {
    variable value remaining = collection;

    for (element1 in other) {
        value position = remaining.locate((element2) => element1 == element2);
        if (exists position) {
            remaining = remaining.initial(position.key).append(remaining.spanFrom(position.key + 1));
        } else {
            // Element was not found in remaining; terminate early
            return false;
        }
    }

    // All elements were found
    return true;
}

print(containsAll(['a', 'b', 'b', 'c'], ['a', 'b']));
print(containsAll(['a', 'b', 'b', 'c'], ['a', 'a']));

append exists only on Sequential, so this won't work on just a List or an Iterable.
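For arbitrary streams, one possible workaround is to count occurrences in a mutable map instead of rebuilding a sequence. This is only a sketch: the name containsAllCounted is made up, and it assumes the ceylon.collection module has been added as a dependency in module.ceylon.

```ceylon
import ceylon.collection { HashMap }

Boolean containsAllCounted<Element>({Element*} collection, {Element*} other)
        given Element satisfies Object {
    // Count how often each element occurs in the left-hand stream.
    value remaining = HashMap<Element, Integer>();
    for (element in collection) {
        remaining.put(element, (remaining.get(element) else 0) + 1);
    }
    // Use up one occurrence for every right-hand element.
    for (element in other) {
        value count = remaining.get(element) else 0;
        if (count <= 0) {
            // No occurrence left to match this element.
            return false;
        }
        remaining.put(element, count - 1);
    }
    return true;
}
```

Each stream is traversed once, which avoids the repeated locate and append calls, at the cost of requiring elements with a usable hash.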

drhagen

The containsEvery function should do what you want. Alternatively, you can also turn both streams into sets using the set function, or use every and contains.

Lucas Werkmeister
  • My examples were not good. Hopefully, the new ones are clearer. I want to test if the stream on the left has all of the elements of the stream on the right, including at least as many of each element. – drhagen Jan 25 '18 at 21:22