I want to create an array of arrays of the structure:
[line_number, count, temperature, humidity, sensor1_on, sensor2_on]
where the first two need to be uint32, temperature and humidity can be uint8, and the two sensor_on flags can be of type bool.
I later need to sort the 2D array by line_number and then by count. I also need to compute averages and other statistics over all the temperature values and over all the humidity values (separately).
I found structured arrays which are convenient for data storage and retrieval:
import numpy as np

np_data = np.zeros([num_lines],
                   dtype='uint32,'  # line number
                         'uint32,'  # count
                         'uint8,'   # temperature
                         'uint8,'   # humidity
                         'bool,'    # sensor 1 on
                         'bool')    # sensor 2 on
The alternative is a plain 2D array with a single dtype:
np_data = np.zeros([num_lines, 5], dtype='uint32')
# I would pack my bools into the last uint32 and unpack them later,
# but that seems like a waste of space.
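To make the comparison concrete, here is a minimal sketch of the operations I need on both layouts. The sample rows, the field names, and the bit layout for the packed bools (bit 0 for sensor 1, bit 1 for sensor 2) are made up for illustration; they are not part of my real data.

```python
import numpy as np

# Hypothetical sample rows:
# (line_number, count, temperature, humidity, sensor1_on, sensor2_on)
rows = [
    (2, 1, 21, 40, True, False),
    (1, 2, 19, 55, False, False),
    (1, 1, 25, 50, True, True),
]

# Layout A: structured array. The field names are my own choice.
rec_dtype = np.dtype([
    ("line_number", np.uint32), ("count", np.uint32),
    ("temperature", np.uint8), ("humidity", np.uint8),
    ("sensor1_on", np.bool_), ("sensor2_on", np.bool_),
])
rec = np.array(rows, dtype=rec_dtype)

# Sort by line_number first, then by count.
rec_sorted = np.sort(rec, order=["line_number", "count"])

# Statistics work on a single field like on any 1D array.
rec_mean_temp = rec["temperature"].mean()
rec_bytes_per_row = rec.dtype.itemsize

# Layout B: plain uint32 array with both bools packed into the last column
# (assumed layout: bit 0 = sensor1_on, bit 1 = sensor2_on).
flat = np.array(
    [(ln, c, t, h, int(s1) | (int(s2) << 1)) for ln, c, t, h, s1, s2 in rows],
    dtype=np.uint32,
)

# np.lexsort treats the LAST key as the primary one,
# so line_number goes last and count goes first.
idx = np.lexsort((flat[:, 1], flat[:, 0]))
flat_sorted = flat[idx]

flat_mean_temp = flat[:, 2].mean()
flat_bytes_per_row = flat.itemsize * flat.shape[1]

# Unpack the bools again after sorting.
sensor1_on = (flat_sorted[:, 4] & 1).astype(bool)
sensor2_on = ((flat_sorted[:, 4] >> 1) & 1).astype(bool)
```

Both versions produce the same ordering and the same means; the structured array keeps the bools as real fields, while the plain array needs the pack/unpack step.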
Do I lose anything (NumPy processing power, vectorized processing, sorting speed, etc.) by using the structured array instead of the array with a single data type? Is there another solution you would recommend?