
Here is an example dataset

First, I try to build a nested dict from the values in each row:

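import csv

who = set()
figure = set()
date = set()
action = []

activity = {'play': 0, 'throw': 0, 'pin': 0, 'tap': 0}

with open(r'ShtrudelT.csv', mode='r', newline='') as csv_file:
    for data in csv.reader(csv_file):
        who.add(data[1])
        figure.add(data[2])
        date.add(data[3][:7])  # keep only the 'YYYY-MM' part
        action.append(data[4].strip())

# Build the nested dict once, after all rows have been read.
# Comprehensions give every (who, figure, date) slot its own copy of the
# counters; dict.fromkeys would make all slots share one dict object.
xdict = {w: {f: {d: dict(activity) for d in date}
             for f in figure}
         for w in who}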

The result is:

{'Googenhaim': {'Circle': {'2020-04': {'play': 0, 'throw': 0, 'pin': 0, 'tap': 0},
   '2020-06': {'play': 0, 'throw': 0, 'pin': 0, 'tap': 0},
   '2020-05': {'play': 0, 'throw': 0, 'pin': 0, 'tap': 0}},
  'Rectangle': {'2020-04': {'play': 0, 'throw': 0, 'pin': 0, 'tap': 0}...}

Second, I need to count the actions for each combination of keys in order to analyze the data: for example, how many times Googenhaim uses Circle, per type of action, in each month.

Is there a solution without using Pandas?

  • Hi. Can you please clarify with an example what you want to compute? Do you want the count of actions for every possible combination of "who", "figure", and "date"? – ranka47 Jul 06 '20 at 15:58
  • For example, I need to count how many times Googenhaim takes each action with Circle in each month (and so on). – Vladimir Abramov Jul 06 '20 at 16:20

1 Answer

import csv

count_dict = {}

with open(r'ShtrudelT.csv', mode='r', newline='') as csv_file:
    # csv.reader splits each line into fields and copes with quoted commas
    for data in csv.reader(csv_file):
        # One key per combination to count: (name, shape, month, action)
        key = (data[1], data[2], data[3][:7], data[4].strip())
        if key in count_dict:
            count_dict[key] += 1
        else:
            count_dict[key] = 1

print("\t".join(["Name", "Shape", "Month", "Action", "Count"]))
for key, count in count_dict.items():
    print("\t".join(key) + "\t" + str(count))

We use a dictionary in which every key is a combination that we want to count: the user's name, the shape, the month, and the action. For every line we build the key. If it is seen for the first time we insert it with a count of 1; otherwise we increment its count.
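If the nested who → figure → month → action layout from the question is preferred, the same counting idea can be sketched with `collections.defaultdict`, which creates each inner counter on demand (so nothing has to be prebuilt, and no dict object ends up shared between keys the way it does with `dict.fromkeys`). This is only a sketch: the sample rows, the `SAMPLE` name, the `count_nested` helper, and the assumed column order (id, who, figure, date, action) are illustrative and not taken from the real file.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the assumed layout: id, who, figure, date, action
SAMPLE = """\
1,Googenhaim,Circle,2020-04-12,play
2,Googenhaim,Circle,2020-04-20,play
3,Googenhaim,Circle,2020-05-02,tap
4,Googenhaim,Rectangle,2020-04-15,pin
"""


def count_nested(csv_file):
    """Return counts[who][figure][month][action] for the rows in csv_file."""
    counts = defaultdict(            # who
        lambda: defaultdict(         # figure
            lambda: defaultdict(     # month
                lambda: defaultdict(int))))  # action -> count
    for row in csv.reader(csv_file):
        if not row:  # skip blank lines
            continue
        who, figure, month, action = row[1], row[2], row[3][:7], row[4].strip()
        counts[who][figure][month][action] += 1
    return counts


counts = count_nested(io.StringIO(SAMPLE))
print(counts['Googenhaim']['Circle']['2020-04']['play'])  # prints 2
```

To run it on the real data, pass an open file object instead of the `io.StringIO` wrapper. Note that reading a missing key from a `defaultdict` creates it, so use `.get()` or `in` when only checking for presence.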

After all the lines are processed, you can do whatever post-processing you need on the counts. Hope that solves it.
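As one possible post-processing step, the counts can be rolled up to answer "how many times does a user use a shape in a month, regardless of action". The sketch below assumes each key has already been broken into its four parts (name, shape, month, action); the `sample_counts` values and the `monthly_totals` helper are made up for illustration.

```python
from collections import Counter

# Made-up counts in the shape produced by the counting step:
# (name, shape, month, action) -> count
sample_counts = {
    ('Googenhaim', 'Circle', '2020-04', 'play'): 2,
    ('Googenhaim', 'Circle', '2020-04', 'tap'): 1,
    ('Googenhaim', 'Circle', '2020-05', 'pin'): 3,
    ('Googenhaim', 'Rectangle', '2020-04', 'play'): 4,
}


def monthly_totals(counts):
    """Sum the counts over all actions for each (name, shape, month)."""
    totals = Counter()
    for (name, shape, month, _action), n in counts.items():
        totals[(name, shape, month)] += n
    return totals


totals = monthly_totals(sample_counts)
for (name, shape, month), n in sorted(totals.items()):
    print(name, shape, month, n, sep="\t")
```

The same pattern works for any other grouping: keep the parts of the key you care about and sum over the rest.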

ranka47