Finding common elements in lists in Python

Suppose I have a list like the following: [[a,b],[a,c],[b,c],[c,d],[e,f],[f,g]]. My output must be [a,b,c,d] and [e,f,g]. How do I do it? This is what I tried:

for i in range(0,len(fin3)):
    for j in range(i+1,len(fin3)):
        grop = []
        grop = list(set(fin3[i]) & set(fin3[j]))
        if len(grop)>0:
            grop2 = []
            grop2.append(link[i])
            grop2.append(link[j])
            grop3.append(grop2)

Thanks in advance...

AstroCB
abhay
  • Why should `[a,b,c,d]` and `[e,f,g]` be separate lists in the output? – tzaman Apr 18 '14 at 19:24
  • And what is the output of what you tried? – jonrsharpe Apr 18 '14 at 19:27
  • Are you implementing [set consolidation](http://rosettacode.org/wiki/Set_consolidation) (merging every group that has a common element until there aren't any more to merge)? If so, there are already many questions about that. – DSM Apr 18 '14 at 19:33
  • @DSM There is also an answer in the link. :) – Quintec Apr 18 '14 at 19:35
  • related: [Replace list of list with “condensed” list of list while maintaining order](http://stackoverflow.com/q/13714755/4279). It shows solutions based on [connected components](http://stackoverflow.com/a/13896383/4279), based on [union-find algorithm for disjoint sets](http://stackoverflow.com/a/13716804/4279), and [ad hoc approaches](http://stackoverflow.com/a/13715626/4279). – jfs Apr 18 '14 at 23:20
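For reference, the union-find idea mentioned in the comment above could look roughly like the sketch below. This is only an illustration, not code from any of the linked answers: the name `group_pairs` is made up here, and it assumes every input item is a pair of hashable, mutually comparable elements (the final `sorted` needs comparability) and a Python version where dicts keep insertion order (3.7+).

```python
def group_pairs(pairs):
    """Group elements linked by any chain of shared pairs (union-find sketch)."""
    parent = {}  # maps each element to its parent; a root is its own parent

    def find(x):
        # Register unseen elements as their own root.
        parent.setdefault(x, x)
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union: attach one group to the other

    # Collect the elements under their root.
    groups = {}
    for element in parent:
        groups.setdefault(find(element), []).append(element)
    return [sorted(group) for group in groups.values()]


data = [['a', 'b'], ['a', 'c'], ['b', 'c'], ['c', 'd'], ['e', 'f'], ['f', 'g']]
print(group_pairs(data))
# → [['a', 'b', 'c', 'd'], ['e', 'f', 'g']]
```

Unlike a single pass that stops at the first overlapping group, this also handles a later pair that links two groups formed earlier.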

4 Answers

I think you want something like:

data = [[1, 2], [2, 3], [4, 5]]

output = []

for item1, item2 in data:
    for item_set in output:
        if item1 in item_set or item2 in item_set:
            item_set.update((item1, item2))
            break
    else:
        output.append(set((item1, item2)))

output = list(map(list, output))  # list() is needed on Python 3, where map() is lazy

This gives:

output == [[1, 2, 3], [4, 5]]
jonrsharpe
  • Not sure that works. Consider `data = [[0,0],[1,2],[2,0]]`; your code produces `[[0,2],[1,2]]`, but I think it should give `[[0,1,2]]`. – DSM Apr 18 '14 at 19:51
  • @DSM true, if there are subsequent pairs that link two existing sets this will break down – jonrsharpe Apr 18 '14 at 19:54
  • your algorithm is missing [recursive step that merges `output` after `item_set.update()`](http://stackoverflow.com/a/23163996/4279) – jfs Apr 18 '14 at 23:52
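One possible way to close the gap raised in these comments is to merge the new pair with every existing set it touches, instead of stopping at the first match. The sketch below is an assumption about how that could be done, not the answerer's code; `merge_pairs` is a hypothetical name, and the output is sorted only to make it predictable (which assumes the elements are comparable).

```python
def merge_pairs(data):
    """Merge pairs into groups, joining every existing group a pair touches."""
    output = []  # invariant: the sets in output are pairwise disjoint
    for pair in data:
        merged = set(pair)
        untouched = []
        for item_set in output:
            if merged & item_set:
                merged |= item_set          # absorb every overlapping group
            else:
                untouched.append(item_set)  # keep unrelated groups as they are
        output = untouched + [merged]
    return [sorted(item_set) for item_set in output]


print(merge_pairs([[1, 2], [2, 3], [4, 5]]))
# → [[1, 2, 3], [4, 5]]
print(merge_pairs([[0, 0], [1, 2], [2, 0]]))
# → [[0, 1, 2]]
```

Because the groups already in `output` never overlap each other, one pass over them per pair is enough; no recursive re-merge is needed.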

If you want to find common elements even if the lists are not adjacent, and the order in the result doesn't matter:

def condense_sets(sets):
    result = []
    for candidate in sets:
        for current in result:
            if candidate & current:   # found overlap
                current |= candidate  # combine (merge sets)

                # new items from candidate may create an overlap
                # between current set and the remaining result sets
                result = condense_sets(result) # merge such sets
                break
        else:  # no common elements found (or result is empty)
            result.append(candidate)
    return result

Example:

>>> data = [['a','b'], ['a','c'], ['b','c'], ['c','d'], ['e','f'], ['f','g']]
>>> list(map(list, condense_sets(map(set, data))))
[['a', 'c', 'b', 'd'], ['e', 'g', 'f']]

See [Replace list of list with “condensed” list of list while maintaining order](http://stackoverflow.com/q/13714755/4279).
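Since sets are unordered, the element order in output like the above can differ between runs and Python versions. If a predictable result is wanted, for example to compare two approaches, a small helper along these lines might do; `normalize` is a made-up name, and it assumes the elements can be compared with each other.

```python
def normalize(groups):
    """Return the groups as sorted lists, themselves in sorted order."""
    return sorted(sorted(group) for group in groups)


print(normalize([{'a', 'c', 'b', 'd'}, {'e', 'g', 'f'}]))
# → [['a', 'b', 'c', 'd'], ['e', 'f', 'g']]
```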

jfs

As was noted in a comment above, it looks like you want to do set consolidation.

Here's a solution I adapted from code at the link in that comment above.

def consolidate(seq):
    if len(seq) < 2:
        return seq
    result, tail = [seq[0]], consolidate(seq[1:])
    for item in tail:
        if result[0].intersection(item):
            result[0].update(item)
        else:
            result.append(item)
    return result

def main():
    sets = [set(pair) for pair in [['a','b'],['a','c'],['b','c'],['c','d'],['e','f'],['f','g']]]
    print("Input: {0}".format(sets))
    result = consolidate(sets)
    print("Result: {0}".format(result))

if __name__ == '__main__':
    main()

Sample output:

Input: [set(['a', 'b']), set(['a', 'c']), set(['c', 'b']), set(['c', 'd']), set(['e', 'f']), set(['g', 'f'])]
Result: [set(['a', 'c', 'b', 'd']), set(['e', 'g', 'f'])]
Paul Bissex

Another approach, which looks about as (in)efficient as the others: O(n^2), where n is the number of items. It's not especially elegant, but it's correct. The function below returns a set of (hashable) frozensets if you pass return_sets=True; otherwise it returns a list of lists (the default, since that is what your question asks for):

def create_equivalence_classes(relation, return_sets=False):
    eq_class = {}
    for x, y in relation:
        # Use tuples of x, y in case either is a string of length > 1 (iterable),
        # and so that elements x, y can be noniterables such as ints.
        eq_class_x = eq_class.get(x, frozenset( (x,) ))
        eq_class_y = eq_class.get(y, frozenset( (y,) ))
        join = eq_class_x.union(eq_class_y)
        for u in eq_class_x:
            eq_class[u] = join
        for v in eq_class_y:
            eq_class[v] = join
    set_of_eq_classes = set(eq_class.values())
    if return_sets:
        return set_of_eq_classes
    else:
        return list(map(list, set_of_eq_classes))

Usage:

>>> data = [['a','b'], ['a','c'], ['b','c'], ['c','d'], ['e','f'], ['f','g']]
>>> print(create_equivalence_classes(data))
[['d', 'c', 'b', 'a'], ['g', 'f', 'e']]
>>> print(create_equivalence_classes(data, return_sets=True))
{frozenset({'d', 'c', 'b', 'a'}), frozenset({'g', 'f', 'e'})}

>>> data1 = [['aa','bb'], ['bb','cc'], ['bb','dd'], ['fff','ggg'], ['ggg','hhh']]
>>> print(create_equivalence_classes(data1))
[['bb', 'aa', 'dd', 'cc'], ['hhh', 'fff', 'ggg']]

>>> data2 = [[0,1], [2,3], [0,2], [16, 17], [21, 21], [18, 16]]
>>> print(create_equivalence_classes(data2))
[[21], [0, 1, 2, 3], [16, 17, 18]]
BrianO