4

Is it possible to group consecutive capitalized words?

For example, I have a list like this:

lst =[['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],['smarter'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

Desired Output:

[['John'],['is'],['smart'],[','],['John','Kenneddy'],['is'],['smarter'],[','],['John','Fitzgerald','Kennedy'],['is'],['best']]
Baba
Arda Nalbant

2 Answers

5

You can use itertools.groupby to group consecutive items by whether their first character is uppercase:

from itertools import groupby

d = [['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

sum(([[x[0] for x in g]] if k else list(g)
     for k, g in groupby(d, key=lambda x: x[0][0].isupper())),
    [])
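The same idea can also be written as an explicit loop, which may be easier to read than the `sum(..., [])` form. This is only a sketch: the function name `group_capitalized` is made up for illustration, and it assumes every inner list holds exactly one non-empty string.

```python
from itertools import groupby


def group_capitalized(items):
    """Merge runs of consecutive one-word lists whose word starts with an uppercase letter."""
    result = []
    # groupby yields (key, run) pairs, where each run is a stretch of
    # consecutive items sharing the same key (True = capitalized).
    for is_upper, run in groupby(items, key=lambda x: x[0][0].isupper()):
        if is_upper:
            # Collapse the whole capitalized run into a single list.
            result.append([x[0] for x in run])
        else:
            # Keep non-capitalized items as separate entries.
            result.extend(run)
    return result


lst = [['John'], ['is'], ['smart'], [','], ['John'], ['Kenneddy'], ['is'],
       ['smarter'], [','], ['John'], ['Fitzgerald'], ['Kennedy'], ['is'], ['best']]
grouped = group_capitalized(lst)
print(grouped)
```

With the list from the question, this should print the desired output, with `['John', 'Kenneddy']` and `['John', 'Fitzgerald', 'Kennedy']` merged.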
niemmi
  • Thank you so much, it worked like a charm. Exactly what I wanted. – Arda Nalbant Apr 28 '16 at 11:28
  • It's a nice solution! But do you know how to make `list(g)` work under Python 3? It throws `TypeError: 'list' object is not callable`. – MaxU - stand with Ukraine Apr 28 '16 at 12:01
  • 1
    @MaxU: If you copy and paste the solution as is, it should work on Python 3. I tested with 3.5 and it produces exactly the same result. Can you check that the initial variable is not named `list`, as in the question, but something else, like `d` in my example? That's exactly why I changed the original variable name, so that it wouldn't hide the builtin `list`. – niemmi Apr 28 '16 at 13:04
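The name clash described in these comments can be reproduced in a few lines. This is a small sketch; the function names `shadowed` and `not_shadowed` are hypothetical and exist only for the demonstration.

```python
def shadowed():
    # Binding the name `list` to data hides the builtin inside this scope.
    list = [['John'], ['is']]
    try:
        list(iter(['a', 'b']))  # tries to call the list object, not the builtin
    except TypeError as exc:
        return str(exc)
    return None


def not_shadowed():
    # A different variable name leaves the builtin `list` usable.
    d = [['John'], ['is']]
    return list(iter(d))


message = shadowed()
print(message)          # 'list' object is not callable
print(not_shadowed())   # [['John'], ['is']]
```

Renaming the data variable (to `d`, `lst`, or anything that is not a builtin name) avoids the error.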
0

lst = [['John'],['is'],['smart'],[','],['John'],['Kenneddy'],['is'],['smarter'],[','],['John'],['Fitzgerald'],['Kennedy'],['is'],['best']]

upperlist = []
tmp = 0  # index of the current item
for l in lst:
    if l[0][0].isupper():
        # Capitalized word: pair it with the previous item, if there is one
        if tmp != 0 and lst[tmp-1] != ",":
            u = lst[tmp-1] + l
            print(u)
            if u[0] == ',':
                # Previous item was a comma: keep the word on its own
                if l not in upperlist:
                    upperlist.append(l)
            else:
                upperlist.append(u)
        else:
            upperlist.append(l)
    else:
        upperlist.append(l)
    tmp = tmp + 1

print(upperlist)
onkar
  • Good job, thanks, but the output is: [['John'], ['is'], ['smart'], [','], ['John'], ['John', 'Kenneddy'], ['is'], ['smarter'], [','], ['John'], ['John', 'Fitzgerald'], ['Fitzgerald', 'Kennedy'], ['is'], ['best']]. Can you remove the extra words? – Arda Nalbant Apr 28 '16 at 10:56
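One way to avoid the extra words with a plain loop is to remember whether the previous word was capitalized and, if so, extend the last output entry instead of appending a new pair. This is a sketch of a different approach, not a patch of the code above; the name `group_capitalized_loop` is hypothetical, and it assumes each inner list is non-empty with the word as its first element.

```python
def group_capitalized_loop(items):
    """Group consecutive capitalized words without itertools."""
    result = []
    prev_upper = False  # was the previous word capitalized?
    for item in items:
        word = item[0]
        # word[:1] is '' for an empty string, and ''.isupper() is False.
        is_upper = word[:1].isupper()
        if is_upper and prev_upper:
            # Continue the current run of capitalized words.
            result[-1].append(word)
        else:
            # Start a new entry (a fresh list, so the input is not mutated).
            result.append([word])
        prev_upper = is_upper
    return result


lst = [['John'], ['is'], ['smart'], [','], ['John'], ['Kenneddy'], ['is'],
       ['smarter'], [','], ['John'], ['Fitzgerald'], ['Kennedy'], ['is'], ['best']]
fixed = group_capitalized_loop(lst)
print(fixed)
```

Because each capitalized word is added exactly once, neither `['John']` nor `['Fitzgerald', 'Kennedy']` shows up as a duplicate fragment.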