Find tuple in list with same first item and return another list

Question

I have a list like this in Python:

[('a', 'b'), ('a', 'c'),('d','f')]

and I want to join the items that share the same first item, so that the result looks like this:

[('a', 'b', 'c'),('d','f')]

What have you tried, and what exactly is the problem with it? Also your input appears to be a string, not a list (it wouldn't be valid syntactically as a list). — jonrsharpe, Mar 30 '19 at 22:51
Are `d` and `f` supposed to be `'d'` and `'f'`? Is the inner type tuple (as you've written) or list (as per the title)? — moonGoose, Mar 30 '19 at 22:55
@PatrickArtner OK, I just edited based on my best guess: since a, b and c are strings, the way he wrote d and e surely means he wanted them to be strings too, rather than variable names. — Vicrobot, Mar 31 '19 at 03:57
@PatrickArtner Also, most new users make the same mistake. I agree with your suggestion though, thanks. — Vicrobot, Mar 31 '19 at 04:05

score 1 · Answer 1 · answered Mar 30 '19 at 23:11

Here is one way to do it. For efficiency, we build a dict keyed by the first value. The values keep the order in which they appear, and so do the merged tuples if you use Python >= 3.7; on older versions you will have to use a collections.OrderedDict.

def join_by_first(sequences):
    out = {}
    for seq in sequences:
        try:
            out[seq[0]].extend(seq[1:])
        except KeyError:
            out[seq[0]] = list(seq)
    return [tuple(values) for values in out.values()]

join_by_first([('a', 'b'), ('a', 'c'),('d','f')])
# [('a', 'b', 'c'), ('d', 'f')]
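For Python versions older than 3.7, the OrderedDict variant mentioned above might look roughly like this. It is only a sketch: the name join_by_first_ordered is made up for illustration, and the try/except is swapped for a plain membership test.

```python
from collections import OrderedDict

def join_by_first_ordered(sequences):
    # Hypothetical variant of join_by_first: OrderedDict guarantees
    # insertion order even on Python < 3.7.
    out = OrderedDict()
    for seq in sequences:
        key = seq[0]
        if key in out:
            # Key already seen: append everything after the first item.
            out[key].extend(seq[1:])
        else:
            # First time we see this key: start a new group.
            out[key] = list(seq)
    return [tuple(values) for values in out.values()]

# Works even when tuples with the same first item are not adjacent.
print(join_by_first_ordered([('a', 'b'), ('d', 'f'), ('a', 'c')]))
# [('a', 'b', 'c'), ('d', 'f')]
```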

score 0 · Answer 2 · answered Mar 30 '19 at 23:11

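You cannot edit tuples: they are immutable. You can use lists and convert everything back to tuples afterward. Note that the loop only compares each tuple with the last group built so far, so it joins tuples with the same first item only when they are adjacent in the input: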

data = [('a', 'b'), ('a', 'c'),('d','f')]

new_data = []


for d in data:                                            # loop over your data
    if new_data and new_data[-1][0] == d[0]:              # if something in new_data and 1st
        new_data[-1].extend(d[1:])                        # ones are identical: extend
    else:
        new_data.append(list(d))                          # not same/nothing in: add items

print(new_data)                   # all are lists

new_data = [tuple(x) for x in new_data]
print(new_data)                   # all are tuples again

Output:

[['a', 'b', 'c'], ['d', 'f']]     # all are lists
[('a', 'b', 'c'), ('d', 'f')]     # all are tuples again

See Immutable vs Mutable types
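If the matching tuples are not guaranteed to be adjacent, one possible workaround is to sort by the first item before grouping. The sketch below is not part of the answer above: the name merge_sorted is made up, it assumes the first items can be compared with each other, and the groups come out in sorted key order rather than in order of first appearance.

```python
from itertools import groupby
from operator import itemgetter

def merge_sorted(data):
    # Hypothetical helper: sort so equal first items become adjacent,
    # then merge each run of tuples sharing that first item.
    first = itemgetter(0)
    merged = []
    for key, group in groupby(sorted(data, key=first), key=first):
        items = [key]
        for tup in group:
            # sorted() is stable, so the remaining items keep their input order.
            items.extend(tup[1:])
        merged.append(tuple(items))
    return merged

print(merge_sorted([('a', 'b'), ('d', 'f'), ('a', 'c')]))
# [('a', 'b', 'c'), ('d', 'f')]
```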

score 0 · Answer 3 · answered Mar 30 '19 at 23:15

I feel like the simplest solution is to build a dictionary in which:

keys are the first items in the tuples
values are lists containing all second items from the tuples

Once we have that we can then build the output list:

from collections import defaultdict

def merge(pairs):
    mapping = defaultdict(list)
    for k, v in pairs:
        mapping[k].append(v)
    return [(k, *v) for k, v in mapping.items()]

pairs = [('a', 'b'), ('a', 'c'),('d','f')]
print(merge(pairs))

This outputs:

[('a', 'b', 'c'), ('d', 'f')]

This solution runs in O(n) time, since each item from pairs is visited a constant number of times: once to build the mapping and once to build the output list.
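The merge function above unpacks each tuple as exactly two items, so it would raise a ValueError on longer or shorter tuples. If that matters, a variant along these lines should cope with any non-empty tuple. This is a sketch under that assumption, and the name merge_any_length is made up for illustration.

```python
from collections import defaultdict

def merge_any_length(tuples):
    # Hypothetical generalisation of merge(): star-unpacking collects
    # everything after the first item, whatever the tuple length.
    mapping = defaultdict(list)
    for key, *rest in tuples:
        mapping[key].extend(rest)
    # Dict insertion order (Python >= 3.7) keeps keys in order of first appearance.
    return [(key, *values) for key, values in mapping.items()]

print(merge_any_length([('a', 'b', 'x'), ('a', 'c'), ('d', 'f')]))
# [('a', 'b', 'x', 'c'), ('d', 'f')]
```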
