16

I have a partially ordered set, say A = [x1, x2, ...], meaning that for each xi and xj in the set, (exactly) one of four possibilities is true: xi < xj, xi == xj, xi > xj, or xi and xj are incomparable.

I want to find the maximal elements (i.e., those elements xi for which there are no elements xj with xi < xj). What is an efficient algorithm to do this (minimize the number of comparisons)? I tried building a DAG and doing a topological sort, but just building the graph requires O(n^2) comparisons, which is too many.

I'm doing this in Python, but if you don't know it I can read other languages, or pseudocode.

vallentin
  • 23,478
  • 6
  • 59
  • 81
asmeurer
  • 86,894
  • 26
  • 169
  • 240
  • To be a [partially ordered set](http://en.wikipedia.org/wiki/Partially_ordered_set) shouldn't xi > xj be disallowed? The idea is that any two elements are in order if they are comparable. – Russell Zahniser Feb 04 '14 at 18:41
  • @Vallentin read the question. I tried building a graph and doing a topological sort. – asmeurer Feb 04 '14 at 18:41
  • @RussellZahniser I'm not sure I understand your comment, but of course `x < y` is allowed. Otherwise you would have no order at all, just a set of incomparable elements. – asmeurer Feb 04 '14 at 18:43
  • I understand that your poset is given as a list of elements plus a comparison function returning, say, -1, 0, 1 or None. Right? – hivert Feb 04 '14 at 18:46
  • Sure. The comparison function is actually just literally `<` or `>` and so on (this is in Python; I'm using `__lt__` and so on), and raising TypeError on not comparable, but that hardly matters. I am interested in the algorithm more than the implementation. Even a sketch in words of something that works would be fine. – asmeurer Feb 04 '14 at 18:48
  • 1
    I suspect, but can't formally prove yet, that there's an information-theoretic lower bound of O(n^2) because I think there are around 2^(n^2) total possible partial orders and each comparison only lets you eliminate about 50% of them at a time. This is related to the sorting lower bound proof: since there are around 2^(n log n) possible orderings of the elements, n log n comparisons are required to sort. – templatetypedef Feb 04 '14 at 18:52
  • Can you please share your code? it might be very helpful to others. – Eran Aug 01 '17 at 09:16

5 Answers

8

It seems the worst case is O(n^2) no matter what you do. For example, if no elements are comparable, then you need to compare every element to every other element in order to determine that they are all maximal.

And if you allow O(n^2), since the ordering is transitive, you can just make one pass through the set, keeping a list of all elements that are maximal so far; each new element knocks out any maximal elements that are < it and gets added to the maximal list if it is not < any maximal element.
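The single pass described above can be sketched as follows. This is only an illustration, not the asker's code: the function name `maximal_elements` and the `less(a, b)` callback (returning True exactly when `a < b`, and False for equal or incomparable elements) are assumptions made for the example. Proper-subset ordering on frozensets stands in for the partial order.

```python
def maximal_elements(items, less):
    """Return the maximal elements of items under the strict order less.

    less(a, b) must return True only when a < b; it returns False for
    equal or incomparable elements. Relies on the order being transitive.
    """
    maximals = []
    for x in items:
        # x is dominated by a current maximal element: it cannot be maximal.
        if any(less(x, m) for m in maximals):
            continue
        # Otherwise x knocks out every current maximal element below it.
        maximals = [m for m in maximals if not less(m, x)]
        maximals.append(x)
    return maximals


# Example: frozensets ordered by proper inclusion (a < b means a is a
# proper subset of b; unrelated sets are incomparable).
sets = [
    frozenset({1}),
    frozenset({1, 2}),
    frozenset({3}),
    frozenset({2}),
    frozenset({1, 2, 4}),
]
result = maximal_elements(sets, lambda a, b: a < b)
```

The number of comparisons is roughly n times the size of the running maximal list, so when there are very few maximal elements the pass is close to linear, and it only degrades to quadratic when many elements are mutually incomparable.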

Russell Zahniser
  • 16,188
  • 39
  • 30
  • Maybe that algorithm is actually fine, and I just over thought things with the DAG. I expect there to be very few maximal elements (99% of the time, there should be only one). In fact, if the maximal element set gets too large I have to start thinking of new orderings I can put on the set to make it small again. – asmeurer Feb 04 '14 at 19:02
  • 1
    There's a lot more to the efficiency of an algorithm than the worst case complexity. For instance, comparing algorithms for sorting totally ordered sets, both bubblesort and quicksort have O(n^2) _worst case_ complexity. However, quicksort has a much better _average_ complexity. On this basis, are you confident that your algorithm is an efficient one? – Stewart Apr 20 '18 at 13:41
4

In the worst case, you can't be faster than O(n^2). Indeed, to check that all elements are maximal in the poset where no elements are comparable, you need to compare every pair of elements. So it's definitely quadratic in the worst case.

Let me clarify, in answer to the comment below: I'm claiming that the worst case is attained when the poset is the trivial poset, where no two elements are comparable. In this case, all elements are maximal. To verify that this is indeed the case, any comparison-based algorithm must perform all n(n-1)/2 comparisons. Indeed, if some comparison, say a <-> b, is not performed, then the algorithm can't distinguish the trivial poset from the poset whose only relation is a < b, so it can't give the correct answer for both. So any algorithm must be at least quadratic in the worst case.

hivert
  • 10,579
  • 3
  • 31
  • 56
  • 2
    Although the conclusion is probably correct, this is not a proof. You are just saying "You can't be faster than O(n²), because I don't know a faster way". – Stef Sep 06 '21 at 14:12
  • 1
    See my edit where I'm clarifying my argument. – hivert Oct 14 '21 at 21:53
4

As other answers have pointed out, the worst case complexity is O(n^2).

However, there are heuristics that can help a lot in practice. For example if the set A is a subset of Z^2 (integer pairs), then we can eliminate a lot of points upfront by:

  1. Sorting along the x-axis (for a given x-value say 1, find the point with max y-value, repeat for all x-values) to get a candidate set of maximals, call it y-maximals.
  2. Similarly get the set x-maximals.
  3. Intersect to get final candidate set xy-maximals.

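This costs O(n) if the per-coordinate maxima are collected with a hash table (O(n log n) if you actually sort). Under the componentwise order on pairs, it is easy to see that any maximal point will be present in xy-maximals. However, it can contain non-maximal points. For example, consider the set {(1,0), (0,1), (2,2)}: all three points survive the filter, but only (2,2) is maximal.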

Depending on your situation, this may be a good enough heuristic. You can follow this up with the exhaustive algorithm on the smaller set xy-maximals.
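A minimal sketch of this pre-filter, assuming the points are integer pairs under the componentwise order ((a, b) <= (c, d) when a <= c and b <= d). The function name `xy_candidates` is made up for the example, and dictionaries are used instead of sorting to collect the per-coordinate maxima.

```python
def xy_candidates(points):
    """Keep only points that are best in their column and in their row.

    Assumes the componentwise order on pairs. Every maximal point
    survives, but some non-maximal points may survive too.
    """
    best_y = {}  # x -> largest y seen among points with that x
    best_x = {}  # y -> largest x seen among points with that y
    for x, y in points:
        if x not in best_y or y > best_y[x]:
            best_y[x] = y
        if y not in best_x or x > best_x[y]:
            best_x[y] = x
    # A point is a candidate only if it is both a y-maximal and an x-maximal.
    return [
        (x, y)
        for x, y in set(points)
        if best_y[x] == y and best_x[y] == x
    ]


# The example from the text: nothing is eliminated, although only (2, 2)
# is actually maximal.
weak_case = xy_candidates([(1, 0), (0, 1), (2, 2)])

# A case where the filter does help: only (1, 1) survives.
good_case = xy_candidates([(0, 0), (0, 1), (1, 0), (1, 1)])
```

The surviving candidates still need an exact pass (for instance a pairwise check) to remove points such as (1,0) and (0,1) in the first example.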

More generally, this problem is called the 'Pareto Frontier' calculation problem. Here are good references:

http://www.cs.yorku.ca/~jarek/papers/vldbj06/lessII.pdf

https://en.wikipedia.org/wiki/Pareto_efficiency#Use_in_engineering_and_economics

In particular the BEST algorithm from the first reference is quite useful.

akshan
  • 363
  • 2
  • 10
2

Suppose you have looked at all (n choose 2) comparisons except for one, between xi and xj, i != j. In some scenarios, the only two candidates for being maximal are exactly these two, xi and xj.

If you do not compare xi and xj, you cannot definitively say whether they are both maximal, or whether only one of them is.

Therefore, you must check all (n choose 2) possible comparisons, which is O(n^2).


Note this assumes your partially ordered set is specified with a black box that will do a comparison. If the partially ordered set is given as a graph to start with, you can subsequently find the set of maximal elements in sub-O(n^2) time.
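For the graph case, here is a small sketch under the assumption that the order is given as a list of nodes plus directed edges (x, y) meaning x < y, with every strict relation reachable through the edges (for example a Hasse diagram). The function name `maximal_from_graph` is hypothetical. The maximal elements are then just the nodes with no outgoing edge, found in time linear in the number of nodes and edges.

```python
def maximal_from_graph(nodes, edges):
    """Return the nodes that have no outgoing edge.

    edges is an iterable of (x, y) pairs, each meaning x < y.
    """
    # Every node that appears as the smaller end of an edge is dominated.
    has_greater = {x for x, _ in edges}
    return [v for v in nodes if v not in has_greater]


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c")]  # a < b < c, and d is incomparable to all
graph_maximals = maximal_from_graph(nodes, edges)
```

This is only sub-quadratic when the graph itself has fewer than on the order of n^2 edges.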

Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
1

It might be true that without more information, the best solution would still be O(n^2), like other answers argue.

Though, if you know a linear extension (i.e. a total order that agrees with your partial order) then I think it can be done in O(n log n). For example, if your elements are strings or vectors/tuples with numbers, and your order is the product order, then the lexicographical (i.e. alphabetical) order is a linear extension. My proposed algorithm in this case is:

  1. Sort the list increasingly with respect to the linear extension (e.g. lexicographical order). Takes O(n log n).
  2. Move the last element to a new list, called maximals.
  3. For every remaining element in the list:
    1. If it is incomparable with the last element in maximals, add it to maximals.
  4. Return maximals

Unfortunately, I have no reference and no proof of correctness for this algorithm; I just think that it's correct.
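Here is a sketch of the sort-then-scan idea, with one deliberate change: each element is checked against every element already in maximals, not only the last one. Checking only the last one can misclassify elements. For example, with 3-tuples under the product order, (3,3,3), (2,5,0), (1,1,1) would all be kept, even though (1,1,1) < (3,3,3). The names `product_less` and `maximal_with_linear_extension` are made up for the example, and Python's built-in tuple ordering serves as the lexicographic linear extension.

```python
def product_less(a, b):
    """Strict product order on equal-length tuples (example order)."""
    return a != b and all(x <= y for x, y in zip(a, b))


def maximal_with_linear_extension(items, less, key=None):
    """Find maximal elements, given a linear extension as the sort order.

    sorted(items, key=key) must be a total order that agrees with less.
    Scanning from largest to smallest, an element can only be dominated
    by something already seen, so maximals never needs pruning.
    """
    maximals = []
    for x in sorted(items, key=key, reverse=True):
        # Variant: compare against all maximals found so far.
        if not any(less(x, m) for m in maximals):
            maximals.append(x)
    return maximals


points = [(1, 1, 1), (2, 5, 0), (3, 3, 3)]
found = maximal_with_linear_extension(points, product_less)
```

With this change the cost is O(n log n) for the sort plus about n times the number of maximal elements in comparisons, so it stays near O(n log n) only when the maximal set is small.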