Iterate and change values of python numpy matrix columns

Question

I have a numpy matrix containing numbers.

1,0,1,1
0,1,1,1
0,0,1,0
1,1,1,1

I would like to perform a z-score normalization over each column: z_score[y] = (y - mean(column)) / sqrt(var), where y is each element of the column, mean is the mean function, sqrt is the square root function, and var is the variance of the column.

My approach was the following:

x_trainT = x_train.T #transpose the matrix to iterate over columns
for item in x_trainT:
    m = item.mean()
    var = np.sqrt(item.var())
    item = (item - m)/var
x_train = x_trainT.T

I thought that during iteration each row is accessed by reference (as with C# lists, for instance), which would let me change the matrix values by changing the row values.
However, I was wrong: the matrix keeps its original values.

Your help is appreciated.

Possible duplicate of [computing z-scores for 2D matrices in scipy/numpy in Python](https://stackoverflow.com/questions/2985135/computing-z-scores-for-2d-matrices-in-scipy-numpy-in-python) — Ruzihm, Oct 10 '19 at 06:40
`item=...` rebinds the name `item` to a new object, breaking its link to the row that the iteration produced. So you aren't modifying the array. – hpaulj, Oct 10 '19 at 07:00
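
To illustrate the comment above, here is a minimal sketch contrasting rebinding the loop name with writing through it using slice assignment. The array `a` and its values are made up for illustration, and the array is assumed to have a float dtype (assigning floats into an integer array would truncate them).

```python
import numpy as np

# Hypothetical float array; the values are for illustration only.
a = np.array([[1.0, 2.0], [3.0, 4.0]])

# Rebinding: `row` is pointed at a brand-new array, so `a` is untouched.
for row in a:
    row = row * 10

unchanged = a.copy()  # snapshot taken after the first loop

# In-place: slice assignment writes through the row view into `a`.
for row in a:
    row[:] = row * 10
```

After the first loop `a` still holds its original values; only the second loop changes it.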

score 2 · Answer 1 · answered Oct 10 '19 at 06:43

I'd recommend avoiding iteration where possible. You can compute the mean and standard deviation column-wise.

>>> import numpy as np
>>> x_train = np.random.random((5, 8))
>>> norm_x_train = (x_train - x_train.mean(axis=0)) / x_train.std(axis=0)
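
As a sketch of how this might look on the matrix from the question: its third column is constant, so its standard deviation is zero and the plain division would produce NaN. The guard below is an assumption on my part (constant columns are centred but not scaled), not something the question asks for; `safe_std` is a name introduced here. Note that `std` defaults to the population standard deviation (`ddof=0`), which matches `sqrt(item.var())` in the question.

```python
import numpy as np

# The matrix from the question, stored as floats.
x_train = np.array([[1, 0, 1, 1],
                    [0, 1, 1, 1],
                    [0, 0, 1, 0],
                    [1, 1, 1, 1]], dtype=float)

mean = x_train.mean(axis=0)  # one mean per column
std = x_train.std(axis=0)    # one standard deviation per column

# Assumption: columns with zero spread are divided by 1 instead of 0,
# so they end up as all zeros rather than NaN.
safe_std = np.where(std == 0, 1.0, std)
norm_x_train = (x_train - mean) / safe_std
```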

Guillem

score 1 · Answer 2 · answered Oct 10 '19 at 06:38

You'll likely have to index by row number and assign back into the array:

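import numpy as np

x_trainT = x_train.T  # a view of x_train: its rows are x_train's columns
for i in range(x_trainT.shape[0]):
    item = x_trainT[i]
    m = item.mean()
    sd = np.sqrt(item.var())
    x_trainT[i] = (item - m)/sd  # indexed assignment writes into the array
x_train = x_trainT.T  # transpose back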

Daniel Nguyen

Skip the transposes, and use `x_train[:,i]`. – hpaulj Oct 10 '19 at 07:01
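
A minimal sketch of what this comment suggests, looping over column indices directly. The data is hypothetical and assumed to be float, with no constant columns (which would give a zero standard deviation).

```python
import numpy as np

# Hypothetical float data, for illustration only.
x_train = np.array([[1.0, 4.0],
                    [3.0, 8.0]])

for i in range(x_train.shape[1]):
    col = x_train[:, i]  # view of column i
    # The right-hand side is a new array; assigning it to the slice
    # overwrites column i of x_train in place.
    x_train[:, i] = (col - col.mean()) / col.std()
```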
