
Suppose I have the list

[7,7,7,7,3,1,5,5,1,4]

I would like to remove duplicates and count them while preserving the order of the list. To remove duplicates while preserving order, I use this function:

def unique(seq, idfun=None):
   # order preserving
   if idfun is None:
       def idfun(x): return x
   seen = {}
   result = []
   for item in seq:
       marker = idfun(item)
       if marker in seen: continue
       seen[marker] = 1
       result.append(item)
   return result

which gives me the output

[7,3,1,5,1,4]

but the output I want (if such a list can exist) is:

[7,3,3,1,5,2,4]

7 is written because it is the first item in the list. Each following item is then checked to see whether it differs from the previous one. If it does, count the occurrences of the same item until a new one is found, then repeat the procedure. Could anyone more skilled than me give me a hint on how to get the desired output listed above? Thank you in advance.

Bhargav Rao

2 Answers


Perhaps something like this?

>>> from itertools import groupby
>>> lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]
>>> seen = set()
>>> out = []
>>> for k, g in groupby(lst):
...     if k not in seen:
...         length = sum(1 for _ in g)  # size of this run of equal items
...         if length > 1:
...             out.extend([k, length])
...         else:
...             out.append(k)
...         seen.add(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 4]

Update:

Going by your comment, I guess you wanted something like this:

>>> out = []
>>> for k, g in groupby(lst):
...     length = sum(1 for _ in g)
...     if length > 1:
...         out.extend([k, length])
...     else:
...         out.append(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 1, 4]
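
To see why the second 1 comes back once the `seen` check is dropped, it may help to look at the runs that `groupby` yields for this input. This is only an illustrative sketch; the name `runs` is made up for the example.

```python
from itertools import groupby

lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]

# groupby only merges *adjacent* equal items, so each run gets its own
# (value, run length) pair, even if the value already appeared earlier.
runs = [(key, sum(1 for _ in group)) for key, group in groupby(lst)]
print(runs)
# [(7, 4), (3, 1), (1, 1), (5, 2), (1, 1), (4, 1)]
```

The two 1s are separated by the run of 5s, so they form two separate runs of length 1.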
Ashwini Chaudhary
    True, it's a weird structure for a list (I don't know what the OP uses it for), but this does solve it – jamylak Jan 17 '15 at 13:31

Try this

import collections as c
lst = [7,7,7,7,3,1,5,5,1,4]
result = c.OrderedDict()
for el in lst:
    if el not in result.keys():
        result[el] = 1
    else:
        result[el] = result[el] + 1

print(result)

prints out: OrderedDict([(7, 4), (3, 1), (1, 2), (5, 2), (4, 1)])

It gives a dictionary though. For a list, use:

lstresult = []
for el in result:
    # keep the element, then its number of extra occurrences if it repeats
    lstresult.append(el)
    if result[el] > 1:
        lstresult.append(result[el] - 1)

It doesn't match your desired output, but your desired output also seems to be a somewhat mangled version of what it is trying to represent.
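
For reference, here is a rough sketch of the two steps above combined, written so it also runs on Python 3. The function names `count_in_order` and `flatten_counts` are hypothetical helpers invented for this example, and the `count - 1` convention is just the one used in the loop above, not necessarily what you need.

```python
from collections import OrderedDict


def count_in_order(seq):
    """Count occurrences, keeping keys in order of first appearance."""
    counts = OrderedDict()
    for el in seq:
        if el not in counts:  # membership test directly on the dict
            counts[el] = 1
        else:
            counts[el] += 1
    return counts


def flatten_counts(counts):
    """Emit each element, followed by its extra occurrences if it repeats."""
    flat = []
    for el, n in counts.items():
        flat.append(el)
        if n > 1:
            flat.append(n - 1)
    return flat


lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]
print(flatten_counts(count_in_order(lst)))
# [7, 3, 3, 1, 1, 5, 1, 4]
```

Note that this counts every occurrence of a value across the whole list, not just adjacent runs, which is why 1 is followed by a count here.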

Brian Leach
    `.keys()` is unnecessary and in Python 2 will load a list into memory and make the `in` check O(N) instead of O(1). `el not in result` works fine – jamylak Jan 17 '15 at 13:26