Collecting all the indices of unique elements in CSV file and populating them in a row

Question

I have a set of data in a CSV file like this:

[['1', '1.5', '1', '2', '1.5', '2'],
 ['2', '2.5', '3', '2.5', '3', '2.5'],
 ['3', '2.5', '1.5', '1', '1', '3'],
 ['1.5', '1', '2', '2', '2', '2.5'],
 ['1.5', '1.5', '1', '2.5', '1', '3']]

I want to find all the unique entries in this data, listed in ascending order. I have tried this code:

import csv
import numpy

dim1 = []
with open('D:/TABLE/unique_values.csv') as f1:
    for rows in f1.readlines():
        dim1.append(rows.strip().split(','))

uniqueValues = numpy.unique(dim1)
print('Unique Values : ', uniqueValues)

and it gives me this output :

Unique Values :  ['1' '1.5' '2' '2.5' '3']

I want to list these unique entries in a column of a CSV file and write the running indices of each unique entry in the row next to it. A sample of the desired output is shown below.

Sample Output

I have tried other numpy functions, but they only return the first occurrence of each unique entry. I have also seen other relevant posts, but they do not populate the running indices of each unique element in a row.

That seems like a pretty obscure transformation to me. I don't expect that you'll find a standard function in numpy or anywhere else that's going to do that for you. It wouldn't be all that hard to code up yourself though. Just create a map with keys that are each value in the table, and with values that are lists containing each of the positions at which the associated key appears. You could easily walk the input table and build that. Then it would be an easy thing to write the contents of that map out just the way you want the new table to look. — CryptoFool, Oct 29 '20 at 05:23

RootTwo · Accepted Answer · 2020-10-29T23:57:32.940

This would be fairly straightforward with some functions from the standard library: collections.defaultdict, csv.reader, and itertools.count. Something like:

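import csv
import collections
import itertools

# Maps each value to the list of running positions where it appears.
data = collections.defaultdict(list)

# Running index over all cells, starting at 1.
index = itertools.count(1)
with open('D:/TABLE/unique_values.csv') as f1:
    reader = csv.reader(f1)

    for row in reader:
        for value in row:
            data[value].append(next(index))

# One line per unique value, followed by its indices.
for unique_value, indices in data.items():
    print(f"{unique_value}:", *indices)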
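The loop above prints the values in the order they are first seen, while the question asks for ascending order and CSV output. A minimal sketch of that last step, assuming every value is a numeric string (so it can be sorted with float). The helper names collect_indices and write_indices are hypothetical and not part of the original answer, and the demo writes to an in-memory buffer rather than a real file:

```python
import collections
import csv
import io
import itertools


def collect_indices(rows):
    """Map each value to the 1-based running positions where it occurs."""
    data = collections.defaultdict(list)
    index = itertools.count(1)
    for row in rows:
        for value in row:
            data[value].append(next(index))
    return data


def write_indices(data, f):
    """Write one CSV row per unique value: the value, then its indices."""
    writer = csv.writer(f)
    # Assumption: every value parses as a number, so sort numerically.
    for value in sorted(data, key=float):
        writer.writerow([value, *data[value]])


# First two rows of the sample data from the question.
rows = [
    ["1", "1.5", "1", "2", "1.5", "2"],
    ["2", "2.5", "3", "2.5", "3", "2.5"],
]
data = collect_indices(rows)

buffer = io.StringIO()  # stand-in for an output file
write_indices(data, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open(path, "w", newline="") in place of the buffer; the csv module documentation recommends newline="" for files used with csv.writer.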

edited Oct 29 '20 at 23:57

answered Oct 29 '20 at 06:13


Thanks a lot. When I tried to run this piece of code, it raised an error: "module 'itertools' has no attribute 'counter'". What does that mean? – Supernova Oct 29 '20 at 11:55
I can see that you edited the code, but it is still giving the same error. I am a bit confused about what is going on. – Supernova Oct 29 '20 at 18:30
I did some searching online and found the relevant information about itertools here (https://docs.python.org/3/library/itertools.html). I changed Counter() to count() and the program worked. Thank you :) – Supernova Oct 29 '20 at 19:41
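
The error discussed in these comments is consistent with a mix-up between two similarly named standard-library tools: itertools.count is a lazy counter that yields successive numbers, while Counter is a tallying class that lives in collections, not itertools. A small sketch of the difference (the variable names are illustrative only):

```python
import collections
import itertools

# itertools.count(1) yields 1, 2, 3, ... on demand.
index = itertools.count(1)
first_three = [next(index) for _ in range(3)]
print(first_three)

# collections.Counter tallies how often each item occurs.
tally = collections.Counter(["1", "1.5", "1"])
print(tally["1"])

# itertools has no Counter attribute, hence the AttributeError.
print(hasattr(itertools, "Counter"))
```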
