
I'm trying to multiply every column in an ndarray by a scalar. When I try to do this, I get the error TypeError: invalid type promotion.

I've tried using array.astype(float), but this gives all NaNs.

import numpy as np

array = np.genfromtxt("file.csv", dtype=float, delimiter='\t', names=True)

newarray = array*4.0  # raises TypeError: invalid type promotion

file.csv has a number of column headers. For example:

array['col_a'] = [5.0, 6.0]

After multiplying by the scalar, I want newarray['col_a'] to be [20.0, 24.0].

tel
gilmour
    You have to multiply each field separately. Or omit the `names` parameter and get a 2d array of dtype float. – hpaulj Jan 12 '19 at 05:17

1 Answer


I'm honestly amazed that this has never come up in my own code, but it turns out that NumPy structured arrays (i.e. arrays with field names) don't support the standard arithmetic operators +, -, *, or / (see footnote *).

Thus, your only option is to work with a non-structured version of your array. @hpaulj's comment points out the ways you can do so (this old answer contains a thorough exploration of exactly how you can get addition to work with structured arrays). Either index a single field (the result of which behaves like a standard array) and multiply that:

import numpy as np
from io import StringIO

csv = '''col_a\tcol_b\tcol_c
5.0\t19.6\t22.8
6.0\t42.42\t39.208
'''

arr = np.genfromtxt(StringIO(csv), dtype=np.float64, delimiter='\t', names=True)

xcol_a = arr['col_a']*4
print(xcol_a)
# output: [20. 24.]

or omit the names=True kwarg when you generate your array (which makes np.genfromtxt return a standard array instead of a structured one):

# skip_header takes a line count, so pass 1 to skip the single header row
arrstd = np.genfromtxt(StringIO(csv), dtype=np.float64, delimiter='\t', skip_header=1)

print(arrstd*4)
# output: [[ 20.     78.4    91.2  ]
#          [ 24.    169.68  156.832]]
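If you want to keep the field names and still scale every column, the single-field approach can be repeated for each field. This is only a minimal sketch of that idea, reusing the sample data from above; the name `scaled` is mine, and the loop assumes every field is numeric.

```python
import numpy as np
from io import StringIO

csv = "col_a\tcol_b\tcol_c\n5.0\t19.6\t22.8\n6.0\t42.42\t39.208\n"
arr = np.genfromtxt(StringIO(csv), dtype=np.float64, delimiter="\t", names=True)

# Work on a copy so the original structured array is left untouched.
scaled = arr.copy()

# Each field behaves like a standard array, so scale them one at a time.
for name in scaled.dtype.names:
    scaled[name] *= 4

print(scaled["col_a"])
# output: [20. 24.]
```

The result is still a structured array, so `scaled['col_b']` and `scaled['col_c']` remain addressable by name.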

*: Technically, it appears that many of NumPy's built-in ufuncs are not supported when working with structured arrays. At least some of the comparison functions/operators (<, >, and ==) are supported.
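A quick way to check the footnote on your own NumPy version is to try the operations directly. The sketch below uses a small hand-built structured array (my own stand-in for the CSV data) and only tests multiplication and ==; the exact exception class and message may differ between NumPy releases, so it catches the general TypeError.

```python
import numpy as np

# Hypothetical two-field structured array standing in for the CSV data.
arr = np.array([(5.0, 19.6), (6.0, 42.42)],
               dtype=[("col_a", "f8"), ("col_b", "f8")])

# Arithmetic on the whole structured array is expected to fail.
try:
    arr * 4
    multiply_ok = True
except TypeError as exc:
    multiply_ok = False
    print("multiplication failed:", type(exc).__name__)

# Equality is compared record by record and gives one boolean per row.
equal = arr == arr
print(equal)
```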

tel
    Just in case anyone else has this problem: I used genfromtxt to get the names for each column, used the tips above to do the multiplication, and then used [this post](https://stackoverflow.com/questions/10742406/programmatically-add-column-names-to-numpy-ndarray?rq=1) to add back the column names. – gilmour Jan 13 '19 at 00:03
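For anyone following the same route (multiply a plain array, then put the column names back), here is one possible sketch. It is not the method from the linked post; it assumes NumPy 1.16 or later, where `numpy.lib.recfunctions` provides `structured_to_unstructured` and `unstructured_to_structured`, and it assumes all fields share a numeric dtype.

```python
import numpy as np
from io import StringIO
from numpy.lib import recfunctions as rfn

csv = "col_a\tcol_b\tcol_c\n5.0\t19.6\t22.8\n6.0\t42.42\t39.208\n"
arr = np.genfromtxt(StringIO(csv), dtype=np.float64, delimiter="\t", names=True)

# Drop the field names: one row per record, one column per field.
plain = rfn.structured_to_unstructured(arr)

# Ordinary arithmetic works on the plain 2-D array; reattach the
# original field names by reusing the structured dtype.
scaled = rfn.unstructured_to_structured(plain * 4, dtype=arr.dtype)

print(scaled.dtype.names)
print(scaled["col_a"])
# output: [20. 24.]
```

Reusing `arr.dtype` keeps the names exactly as genfromtxt produced them, so nothing has to be retyped by hand.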