
Given a list of objects and a non-transitive equality function that returns true when two objects are equal and false otherwise, I need to find all largest sublists where at least two objects are equal. For example:

val list = List(o1, o2, o3, o4, o5)

and,

isEqual(o1, o2) => true
isEqual(o2, o4) => true
isEqual(o3, o5) => true

The result will be:

List(o1, o2, o4)
List(o3, o5)

Please note that isEqual is non-transitive, i.e. in the case above o1 may not be equal to o4 even though they belong to the same sublist.
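
To make "non-transitive" concrete, here is a minimal C++ sketch. The predicate below is a hypothetical stand-in, not the isEqual from the question: it treats two integers as equal when they differ by at most 1, so 1 equals 2 and 2 equals 3, yet 1 does not equal 3.

```cpp
#include <cstdlib>

// Hypothetical stand-in for isEqual (an assumption for illustration only):
// two ints count as "equal" when they differ by at most 1.
bool isEqual(int a, int b) {
    return std::abs(a - b) <= 1;
}
```

With this predicate, 1, 2 and 3 would all land in the same sublist even though isEqual(1, 3) is false.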

Shirish Kumar
  • If at least two objects must be equal, then isn't the list itself the largest subsequence since it already contains at least one pair of equal objects? – EvilTak Aug 08 '17 at 14:11
  • @Evil Tak - you are correct. I just wanted to be explicit. – Shirish Kumar Aug 08 '17 at 14:47
  • One obvious solution is to first generate all pairs that are equal, which will take O(N^2), and then find connected components. I am wondering if this can be done faster. – Shirish Kumar Aug 08 '17 at 14:48

2 Answers


Your problem is equivalent to the problem of finding all connected components of a graph.

So the first thing to do is to convert your list to a graph G(V, E), where V is the set of vertices and E is the set of edges:

V = list
E = {(a, b) for all a, b in list | isEqual(a, b)}

After that, run a DFS from every node that has not been visited yet to find all components:

i = 0
WHILE-EXISTS unvisited node v in G DO
     component[i] = DFS(G, v)
     i = i + 1
END

Of course, the components are graphs themselves. The components are the lists you are looking for, and the vertices in each component are the elements of that list.
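
The two steps above can be sketched in C++ as follows. This is only an illustration under some assumptions: objects are represented by their indices 0..n-1, the predicate is passed in as a std::function and is assumed to be symmetric, and the function name components is made up for this sketch.

```cpp
#include <functional>
#include <vector>

// Groups indices 0..n-1 into connected components, where i and j are
// adjacent when isEqual(i, j) returns true. isEqual is assumed symmetric.
std::vector<std::vector<int>> components(
        int n, const std::function<bool(int, int)>& isEqual) {
    // Build the adjacency lists: n*(n-1)/2 calls to isEqual.
    std::vector<std::vector<int>> adj(n);
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (isEqual(i, j)) {
                adj[i].push_back(j);
                adj[j].push_back(i);
            }

    std::vector<bool> visited(n, false);
    std::vector<std::vector<int>> result;
    for (int start = 0; start < n; ++start) {
        if (visited[start]) continue;
        // Iterative DFS from an unvisited node collects one component.
        std::vector<int> component, stack{start};
        visited[start] = true;
        while (!stack.empty()) {
            int v = stack.back();
            stack.pop_back();
            component.push_back(v);
            for (int w : adj[v])
                if (!visited[w]) {
                    visited[w] = true;
                    stack.push_back(w);
                }
        }
        result.push_back(component);
    }
    return result;
}
```

An object that is equal to nothing else comes back as a component of size one, so to match the question you would probably drop components with fewer than two elements.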

For your example, the graph would look like this:

[Figure: graph of the example with edges o1-o2, o2-o4 and o3-o5]

NOTICE: Since you have to compare each pair of objects, the conversion will take O(n^2). Finding all components with a DFS takes O(|V| + |E|), which is at most O(n^2) here. So this algorithm has an asymptotic runtime of O(n^2).

Answer to the comment in your question

Since the conversion of your problem to this graph problem seems correct, I am pretty sure it is not possible to do this faster. If you see it as a graph, you simply have to check, for each node, whether it is connected to each other node. You also cannot stop after you find one equal node, because you may find another equal node later, and by stopping you would split the connected component.

Andreas

You can use the disjoint-set union (union-find) algorithm to find all the connected components, and then print the lists. The time complexity of the code below is O(N log N): weighted_union reduces the cost of a union to O(log N), so if we perform N unions, it will take O(N log N) in the worst case.

#include <iostream>
#include <map>
#include <vector>
using namespace std;

// Arr[i] is the parent of i; sz[r] is the size of the set rooted at r.
// (Named sz because a global called size can clash with std::size in C++17.)
int Arr[100], sz[100];

// Find the root of i, halving the path on the way up.
int root (int i)
{
    while(Arr[ i ] != i)
    {
        Arr[ i ] = Arr[ Arr[ i ] ] ;
        i = Arr[ i ];
    }
    return i;
}

// Merge the sets containing A and B, attaching the smaller tree to the larger.
void weighted_union(int A,int B)
{
    int root_A = root(A);
    int root_B = root(B);
    if(root_A == root_B) return; // already in the same set
    if(sz[root_A] < sz[root_B ])
    {
        Arr[ root_A ] = root_B;
        sz[root_B] += sz[root_A];
    }
    else
    {
        Arr[ root_B ] = root_A;
        sz[root_A] += sz[root_B];
    }
}

void initialize( int N)
{
    for(int i = 0;i<N;i++)
    {
        Arr[ i ] = i ;
        sz[ i ] = 1;
    }
}

int main() {
    // Elements are numbered 1..5, so index 0 is unused.
    initialize(6);
    weighted_union(1,2);
    weighted_union(2,4);
    weighted_union(3,5);

    // Group elements by root; use root(i), since Arr[i] may be an intermediate parent.
    map<int, vector<int> >m;
    for (int i=1;i<=5;i++) {
        m[root(i)].push_back(i);
    }

    for (map<int,vector<int> >::iterator it=m.begin(); it!=m.end(); ++it) {
        const vector<int>& x = it->second;
        for(size_t j=0;j<x.size();++j) {
            cout<<x[j]<<" ";
        }
        cout<<endl;
    }

    return 0;
}

You can find the solution here: http://ideone.com/vsT9Jh

sourabh1024
  • Thanks for the response. But with this approach it will still take O(N^2) to find all possible unions. – Shirish Kumar Aug 08 '17 at 16:16
  • No, this will give you all possible unions in O(NlogN) + O(different_set * No_of_elements_in_set) – sourabh1024 Aug 08 '17 at 16:42
  • You can study more about its time complexity here : https://www.hackerearth.com/practice/notes/disjoint-set-union-union-find/ – sourabh1024 Aug 08 '17 at 16:45
  • What you have described above is weighted union find problem. Above, you have assumed that the three unions (`weighted_union(1,2)`, `weighted_union(2,4)` and `weighted_union(3,5)`) are available. Unless I am missing something, in my case, if I use union find or connected component approach, I need to first find these tuples connectivity - and that would take n^2 time. – Shirish Kumar Aug 08 '17 at 17:57
  • Yes, that would require all relations. You cannot optimise that, as it's non-transitive in nature. – sourabh1024 Aug 08 '17 at 20:04
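
The point raised in the comments above can be sketched in C++: when the equal pairs are not given up front, the unions have to be driven by comparing every pair, so the comparisons dominate the cost. This is only an illustration under some assumptions: objects are represented by indices 0..n-1, the predicate is passed in as a std::function, and the names DisjointSets and groupByEquality are made up for this sketch.

```cpp
#include <functional>
#include <map>
#include <numeric>
#include <utility>
#include <vector>

// Minimal weighted union-find with path halving.
struct DisjointSets {
    std::vector<int> parent, size;
    explicit DisjointSets(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
    }
    int root(int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }
    void unite(int a, int b) {
        a = root(a);
        b = root(b);
        if (a == b) return;                      // already in the same set
        if (size[a] < size[b]) std::swap(a, b);  // attach smaller tree to larger
        parent[b] = a;
        size[a] += size[b];
    }
};

// Hypothetical helper: groups indices 0..n-1 using all pairwise comparisons.
std::map<int, std::vector<int>> groupByEquality(
        int n, const std::function<bool(int, int)>& isEqual) {
    DisjointSets sets(n);
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)  // n*(n-1)/2 comparisons
            if (isEqual(i, j)) sets.unite(i, j);

    std::map<int, std::vector<int>> groups;
    for (int i = 0; i < n; ++i) groups[sets.root(i)].push_back(i);
    return groups;
}
```

The double loop makes n*(n-1)/2 calls to isEqual regardless of how cheap each union is, which is the O(n^2) cost discussed in the comments.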