Python - Join Two List Object Eachother If Their First Name Uppercase

Question

Is it possible to group consecutive capitalized words?

For example, I have a list like this:

lst =[['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],['smarter'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

Desired Output:

[['John'],['is'],['smart'],[','],['John','Kenneddy'],['is'],['smarter'],[','],['John','Fitzgerald','Kennedy'],['is'],['best']]

Basically, the first list contains tokenized words. I want to group consecutive capitalized members so that each run counts as one item. – Arda Nalbant, Apr 28 '16 at 10:46

score 5 · Accepted Answer · answered Apr 28 '16 at 10:59

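You can use itertools.groupby to split the list into runs of consecutive items, keyed on whether the word starts with an uppercase letter, and then merge each uppercase run into a single list: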

from itertools import groupby

d = [['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

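# Runs of capitalized words (k is True) collapse into one list;
# all other runs are kept item by item.
result = sum(([[x[0] for x in g]] if k else list(g)
              for k, g in groupby(d, key=lambda x: x[0][0].isupper())),
             [])
print(result)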
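For readers who find the one-liner dense, the same idea can be sketched as an explicit loop. The function name `group_capitalized` and the variable `tokens` are made up for this illustration and are not part of the answer; the input is the list from the question.

```python
from itertools import groupby


def group_capitalized(tokens):
    """Merge runs of consecutive capitalized one-word lists into a single list.

    `tokens` is assumed to be a list of one-element lists of strings,
    as in the question. The name of this helper is hypothetical.
    """
    result = []
    # groupby yields one run per stretch of consecutive items sharing a key;
    # the key here is "does the word start with an uppercase letter?"
    for is_upper, run in groupby(tokens, key=lambda t: t[0][:1].isupper()):
        run = list(run)
        if is_upper:
            # Collapse the whole run into one list of words.
            result.append([t[0] for t in run])
        else:
            # Keep non-capitalized tokens as separate one-word lists.
            result.extend(run)
    return result


tokens = [['John'], ['is'], ['smart'], [','], ['John'], ['Kenneddy'],
          ['is'], ['smarter'], [','], ['John'], ['Fitzgerald'], ['Kennedy'],
          ['is'], ['best']]
grouped = group_capitalized(tokens)
print(grouped)
```

Slicing with `[:1]` rather than indexing with `[0]` is a small defensive choice so that an empty string would not raise an IndexError; the original answer indexes directly.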

answered Apr 28 '16 at 10:59

niemmi


Thank you so much, it worked like a charm. Exactly what I wanted. – Arda Nalbant Apr 28 '16 at 11:28
It's a nice solution! But do you know how to make `list(g)` work under Python 3? It throws `TypeError: 'list' object is not callable`. – MaxU - stand with Ukraine Apr 28 '16 at 12:01

@MaxU: If you copy and paste the solution as such it should work on Python 3, I tested with 3.5 and it produces exactly the same result. Can you check that the initial variable is not named as `list` as in question but something else, like `d` in my example? That's exactly the reason I changed the original name of the variable so that it wouldn't hide builtin `list`. – niemmi Apr 28 '16 at 13:04
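The error discussed in these comments can be reproduced in isolation. The sketch below is only an illustration of name shadowing; the function `demo_shadowing` is a hypothetical helper, and the shadowing is confined to its local scope so the builtin stays usable elsewhere.

```python
def demo_shadowing():
    # Binding the name `list` to data hides the builtin inside this function.
    list = [['John'], ['is']]
    try:
        list(iter([1, 2]))  # calls the data, not the builtin type
    except TypeError as exc:
        return str(exc)
    return None


message = demo_shadowing()
print(message)

# Outside the function the builtin is untouched and still callable.
copied = list(iter([1, 2]))
print(copied)
```

Choosing a different variable name, such as `d` or `lst`, avoids the problem entirely.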

onkar · Answer 2 · 2016-04-28T11:13:40.697

0

lst = [['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],['smarter'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

upperlist = []
tmp = 0  # index of the current token in lst
for l in lst:
    # Merge into the previous group only when both this token and the
    # previous one start with an uppercase letter.
    if l[0][0].isupper() and tmp != 0 and lst[tmp-1][0][0].isupper():
        upperlist[-1] = upperlist[-1] + l
    else:
        upperlist.append(l)
    tmp = tmp + 1

print(upperlist)

edited Apr 28 '16 at 11:13

answered Apr 28 '16 at 10:51

onkar


Good job, thanks, but the output is: [['John'], ['is'], ['smart'], [','], ['John'], ['John', 'Kenneddy'], ['is'], ['smarter'], [','], ['John'], ['John', 'Fitzgerald'], ['Fitzgerald', 'Kennedy'], ['is'], ['best']]. Can you remove the extra words? – Arda Nalbant Apr 28 '16 at 10:56
