I am attempting to produce a transition matrix based on the first column of a CSV file. There are 59 possible states, stored as integers in a list.

At the moment my output is simply a long list of '0.000' values. It has the correct size as far as I can tell (59*59), but none of the weights it should have.

The data is definitely in the activities list, because it can be printed separately.

import csv
data = open('file.csv')

csv_data = csv.reader(data)
data_lines = list(csv_data)

activities = []

for row in data_lines[:5954]:
    activities.append(row[0])

activities = list(map(int, activities))

def transition_matrix(activities):
    n = 59 #number of states

    m = [[0]*n for _ in range(n)]

    for (i,j) in zip(activities,activities[1:]):
        m[i][j] += 1

    #now convert to probabilities:
    for row in m:
        s = sum(row)
        if s > 0:
            row[:] = [f/s for f in row]
    return m

t = [activities]
m = transition_matrix(t)
for row in m: print(' '.join('{0:.3f}'.format(x) for x in row))
  • `t = [activities]` looks weird, `t` is now a new list that contains one element and that one element is your long `activities` list. Why not just pass `activities` directly to `transition_matrix()`? – turbulencetoo Mar 09 '18 at 17:02

2 Answers

You are declaring t as a list which contains activities.

However, activities is already a list.

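This means the function does not receive the data in the format it expects: t has only one element, so zip(activities, activities[1:]) inside the function produces no pairs, no transitions are counted, and the matrix stays all zeros. To fix this, use t = activities, or pass activities directly into transition_matrix() without using t.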
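As a rough illustration of the point above, the sketch below compares the pairs that zip produces for the wrapped list and for the flat list. The short sequence is made up for the example and only stands in for the CSV column.

```python
# Hypothetical short sequence standing in for the first CSV column.
activities = [0, 1, 1, 2, 0]

# Wrapping the list gives a one-element list, so there are no consecutive pairs.
wrapped = [activities]
wrapped_pairs = list(zip(wrapped, wrapped[1:]))

# The flat list yields one (current, next) pair per transition.
flat_pairs = list(zip(activities, activities[1:]))

print(wrapped_pairs)  # []
print(flat_pairs)     # [(0, 1), (1, 1), (1, 2), (2, 0)]
```

With no pairs to count, every cell keeps its initial value of 0, which matches the all-zero output described in the question.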

Adi219

It looks like you're creating a nested list when you assign t. Try removing the brackets around [activities].
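For reference, here is a minimal sketch of the call without the extra brackets. It reuses the logic of the question's transition_matrix, with two assumptions added for the example: the number of states is passed in as a parameter n, and the input is a short made-up sequence rather than the CSV data. States are assumed to be integers from 0 to n-1.

```python
def transition_matrix(activities, n):
    """Return row-normalised transition frequencies for states 0..n-1."""
    m = [[0] * n for _ in range(n)]

    # Count each (current state, next state) pair.
    for i, j in zip(activities, activities[1:]):
        m[i][j] += 1

    # Convert counts to probabilities, leaving all-zero rows untouched.
    for row in m:
        s = sum(row)
        if s > 0:
            row[:] = [f / s for f in row]
    return m


# Made-up sequence with 3 states, passed directly as a flat list.
sample = [0, 1, 1, 2, 0, 1]
m = transition_matrix(sample, 3)
for row in m:
    print(' '.join('{0:.3f}'.format(x) for x in row))
```

For the real data, the equivalent call would be transition_matrix(activities, 59), provided every value in activities lies between 0 and 58.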

krflol