1

I have a list like this in Python:

[('a', 'b'), ('a', 'c'),('d','f')]

and I want to join the items that share the same first item, to get a result like this:

[('a', 'b', 'c'),('d','f')]
  • What have you tried, and what exactly is the problem with it? Also, your input appears to be a string, not a list (it wouldn't be syntactically valid as a list). – jonrsharpe Mar 30 '19 at 22:51
  • Are `d` and `f` supposed to be `'d'` and `'f'`? Is the inner type tuple (as you've written) or list (as per the title)? – moonGoose Mar 30 '19 at 22:55
  • edit it again :) – Amir Hossein Mar 30 '19 at 23:09
  • @PatrickArtner OK, I just edited it to the most likely intent: since a, b and c are strings, the way he wrote d and e surely means he wanted them to be strings too, rather than variable names. – Vicrobot Mar 31 '19 at 03:57
  • @PatrickArtner Also, most newbie users make the same mistake. Although I agree with your suggestion, thanks. – Vicrobot Mar 31 '19 at 04:05

3 Answers

1

Here is one way to do it. For efficiency, we build a dict with the first value as key. We keep the values in the order in which they appear (and the tuples in their original order as well, if you use Python >= 3.7; otherwise you will have to use a collections.OrderedDict).

def join_by_first(sequences):
    out = {}
    for seq in sequences:
        try:
            out[seq[0]].extend(seq[1:])
        except KeyError:
            out[seq[0]] = list(seq)
    return [tuple(values) for values in out.values()]

join_by_first([('a', 'b'), ('a', 'c'),('d','f')])
# [('a', 'b', 'c'), ('d', 'f')]
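For Python versions older than 3.7, the OrderedDict fallback mentioned above might look like the sketch below. The name join_by_first_ordered is made up for illustration, and the membership test replaces the try/except only as a matter of taste; the behaviour is meant to match join_by_first.

```python
from collections import OrderedDict


def join_by_first_ordered(sequences):
    # Hypothetical variant of join_by_first: an OrderedDict remembers
    # insertion order on every Python 3 version, so the groups come out
    # in the order in which their first item was first seen.
    out = OrderedDict()
    for seq in sequences:
        if seq[0] in out:
            out[seq[0]].extend(seq[1:])   # known key: append the remaining items
        else:
            out[seq[0]] = list(seq)       # new key: start with the whole tuple
    return [tuple(values) for values in out.values()]


# The two 'a' tuples are deliberately not adjacent here.
print(join_by_first_ordered([('a', 'b'), ('d', 'f'), ('a', 'c')]))
# [('a', 'b', 'c'), ('d', 'f')]
```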
Thierry Lathuille
0

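You cannot edit tuples: they are immutable. You can use lists and convert everything back to tuples afterwards. Note that the loop below compares each tuple only with the last group built so far, so it merges tuples only when they are adjacent in the input: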

data = [('a', 'b'), ('a', 'c'),('d','f')]

new_data = []


for d in data:                                            # loop over your data
    if new_data and new_data[-1][0] == d[0]:              # if something in new_data and 1st
        new_data[-1].extend(d[1:])                        # ones are identical: extend
    else:
        new_data.append( [a for a in d] )                 # not same/nothing in: add items

print(new_data)                   # all are lists

new_data = [tuple(x) for x in new_data]
print(new_data)                   # all are tuples again      

Output:

[['a', 'b', 'c'], ['d', 'f']]     # all are lists
[('a', 'b', 'c'), ('d', 'f')]     # all are tuples again   

See Immutable vs Mutable types
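If tuples sharing a first item are not next to each other in the input, one possible workaround is to sort by the first item before grouping. The sketch below uses sorted together with itertools.groupby; the function name is made up for illustration. It assumes the first items can be compared with each other, and that it is acceptable for the groups to come out in sorted order rather than in order of first appearance.

```python
from itertools import groupby
from operator import itemgetter


def join_adjacent_after_sort(data):
    # sorted() is stable, so tuples with the same first item keep
    # their original relative order inside each group.
    ordered = sorted(data, key=itemgetter(0))
    result = []
    for key, group in groupby(ordered, key=itemgetter(0)):
        merged = [key]
        for item in group:
            merged.extend(item[1:])       # collect everything after the first item
        result.append(tuple(merged))      # convert back to an immutable tuple
    return result


print(join_adjacent_after_sort([('a', 'b'), ('d', 'f'), ('a', 'c')]))
# [('a', 'b', 'c'), ('d', 'f')]
```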

Patrick Artner
0

I feel like the simplest solution is to build a dictionary in which:

  • keys are the first items in the tuples
  • values are lists containing all second items from the tuples

Once we have that we can then build the output list:

from collections import defaultdict

def merge(pairs):
    mapping = defaultdict(list)
    for k, v in pairs:
        mapping[k].append(v)
    return [(k, *v) for k, v in mapping.items()]

pairs = [('a', 'b'), ('a', 'c'),('d','f')]
print(merge(pairs))

This outputs:

[('a', 'b', 'c'), ('d', 'f')]

This solution runs in O(n) time, since each item from pairs is visited at most twice: once when building the dictionary and once when building the output list.
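The merge function above unpacks each element as k, v, so it expects exactly two items per tuple. If the input may contain tuples of other lengths, a possible generalisation (a sketch only; merge_any_length is a made-up name, and it relies on dicts preserving insertion order, i.e. Python 3.7+) is to use starred unpacking:

```python
from collections import defaultdict


def merge_any_length(sequences):
    mapping = defaultdict(list)
    for first, *rest in sequences:
        # extend() accepts zero or more trailing items, so one-item
        # tuples still register their key with an empty list.
        mapping[first].extend(rest)
    return [(first, *rest) for first, rest in mapping.items()]


print(merge_any_length([('a', 'b'), ('a', 'c', 'e'), ('d', 'f'), ('g',)]))
# [('a', 'b', 'c', 'e'), ('d', 'f'), ('g',)]
```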

cglacet