I want to change the data type of one column in a NumPy array, but when I assign the converted column back into the original array, the dtype does not change.

import numpy as np 

arraylist =[(1526869384.273246, 0, 'a0'),
(1526869385.273246, 1, 'a1'),
(1526869386.273246, 2, 'a2'),
(1526869387.273246, 3, 'a3'),
(1526869388.273246, 4, 'a4'),
(1526869389.273246, 5, 'a5'),
(1526869390.273246, 6, 'a6'),
(1526869391.273246, 7, 'a7'),
(1526869392.273246, 8, 'a8'),
(1526869393.273246, 9, 'a9'),
(1526869384.273246, 0, 'a0'),
(1526869385.273246, 1, 'a1'),
(1526869386.273246, 2, 'a2'),
(1526869387.273246, 3, 'a3'),
(1526869388.273246, 4, 'a4'),
(1526869389.273246, 5, 'a5'),
(1526869390.273246, 6, 'a6'),
(1526869391.273246, 7, 'a7'),
(1526869392.273246, 8, 'a8'),
(1526869393.273246, 9, 'a9')]

array =  np.array(arraylist)

array.dtype

dtype('<U32')

array[:,0]=array[:,0].astype("float64")
array[:,0].dtype 

dtype('<U32')

Even though I converted the column to float64, why is its dtype still `<U32` after I assign it back to the original array?

cs95
nooper
  • Read https://stackoverflow.com/questions/49751000/how-does-numpy-determin-the-arrays-dtype-and-what-it-means/49751834#49751834 and https://stackoverflow.com/questions/11309739/store-different-datatypes-in-one-numpy-array?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa – Mazdak May 21 '18 at 05:54
  • By default, `np.array` assigns the best common `dtype` to the whole array, in this case a string. Once the array is created, that dtype is fixed and can't be changed by simple assignment. Consider structured arrays or object dtype arrays if you must mix floats and strings, but beware that those come with an increased processing cost. – hpaulj May 21 '18 at 06:22
  • Since you have a list of tuples, creating a structured array with 3 fields will be relatively easy. – hpaulj May 21 '18 at 06:27
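
The behaviour described in the comments can be reproduced with a short sketch. It uses a two-row subset of the question's data, and the names `rows`, `as_text`, `timestamps` and `as_objects` are made up for illustration. It shows that assigning into a string array converts the values back to strings, and that an object dtype array is one way to keep mixed Python values.

```python
import numpy as np

# Two rows from the question's data (shortened for illustration).
rows = [(1526869384.273246, 0, 'a0'), (1526869385.273246, 1, 'a1')]

# Without an explicit dtype, NumPy picks one common dtype for all elements: a string.
as_text = np.array(rows)
print(as_text.dtype.kind)  # U

# The conversion itself works, but the result is a new, separate array.
timestamps = as_text[:, 0].astype("float64")
print(timestamps.dtype)  # float64

# Assigning the floats back converts them to strings again,
# because the dtype of as_text is fixed for the whole array.
as_text[:, 0] = timestamps
print(as_text.dtype.kind)  # U

# An object array stores each Python value unchanged, at some processing cost.
as_objects = np.array(rows, dtype=object)
print(type(as_objects[0, 0]).__name__, type(as_objects[0, 2]).__name__)  # float str
```

If only the numeric column is needed, keeping `timestamps` as its own array is probably the simplest option.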

1 Answer

If you're okay with named columns, you can define a list of (name, dtype) pairs and pass it as the dtype when creating the array:

dtype = [('A', 'float'), ('B', 'int'), ('C', '<U32')]
array = np.array(arraylist, dtype=dtype)

array['A'].dtype  # note: array[:, 0] does not work here, since this is a 1-D array with named fields
dtype('float64')
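
As a rough sketch of how such a structured array behaves, the snippet below uses a two-row subset of the question's data. The names `rows`, `records`, `new_dtype` and `converted` are made up for illustration. It also shows one possible way to change a field's type later, by casting the whole array to a new structured dtype with `astype`.

```python
import numpy as np

# Two rows from the question's data (shortened for illustration).
rows = [(1526869384.273246, 0, 'a0'), (1526869385.273246, 1, 'a1')]
dtype = [('A', 'float'), ('B', 'int'), ('C', '<U32')]
records = np.array(rows, dtype=dtype)

print(records.shape)       # (2,): one record per tuple, not a 2-D table
print(records['A'].dtype)  # float64
print(records[0]['C'])     # a0

# Positional column indexing fails because the array has only one axis.
try:
    records[:, 0]
except IndexError as exc:
    print("IndexError:", exc)

# Assumed approach for changing one field's type afterwards:
# cast to a structured dtype with the same field names.
new_dtype = [('A', 'float'), ('B', 'float'), ('C', '<U32')]
converted = records.astype(new_dtype)
print(converted['B'].dtype)  # float64
```

Note that `astype` returns a new array, so `records` itself keeps its original field types.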
cs95