I'm using numpy 1.8.2 and the following code results in the error below:

import numpy as np

data = []
data.append(['2015-01-03 05:00:00', 5, 5.01])
data.append(['2015-01-04 05:00:00', 7, 7.01])
data.append(['2015-01-05 05:00:00', 8, 8.01])
data.append(['2015-01-06 05:00:00', 10, 10.01])

dt = np.dtype('M8', '<f8', '<f8')

np.array(data, dtype=dt)

produces the following output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-a3d77026bff9> in <module>()
  9 dt = np.dtype('M8', '<f8', '<f8')
 10 
---> 11 np.array(data, dtype=dt)

ValueError: Could not convert object to NumPy datetime

Is there something I'm doing wrong?

I'm especially confused because

np.datetime64('2015-01-06 05:00:00')

produces the expected output:

numpy.datetime64('2015-01-06T05:00:00-0500')
gstanley

1 Answer

Yes. Each element of your data list actually looks like this:

>>> data[0]
['2015-01-03 05:00:00', 5, 5.01]

But what you're testing your conversion on is this:

'2015-01-03 05:00:00'

One is a string and the other is a list. NumPy won't, to my knowledge, look inside the list. The code below demonstrates the difference.

data = []
data.append('2015-01-03 05:00:00')
data.append('2015-01-04 05:00:00')
data.append('2015-01-05 05:00:00')
np.array(data, dtype=dt)

#output
array(['2015-01-03T05:00:00+0100', '2015-01-04T05:00:00+0100',
       '2015-01-05T05:00:00+0100'], dtype='datetime64[s]')

The way to get your code to work would be to convert just the first element of each inner list to a datetime and store it back in place.

for i in range(len(data)):
    date = np.array(data[i][0], dtype=dt)
    data[i][0] = date

This can be done better than with a for loop, which could take some time for larger lists. If you have to have such a mixed-type array, isn't it easier to handle it with a class, or to keep several separate arrays, each holding its own kind of data? After the loop, data looks like this:

>>> data
[[array(datetime.datetime(2015, 1, 3, 4, 0), dtype='datetime64[s]'), 5, 5.01],
 [array(datetime.datetime(2015, 1, 4, 4, 0), dtype='datetime64[s]'), 7, 7.01],
 [array(datetime.datetime(2015, 1, 5, 4, 0), dtype='datetime64[s]'), 8, 8.01],
 [array(datetime.datetime(2015, 1, 6, 4, 0), dtype='datetime64[s]'), 10, 10.01]]

You started with a list of lists and you end up with a list of lists whose first elements are arrays. Optionally, you could get an array of arrays by also calling np.asarray(data), which won't cause an error this time.
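The "separate arrays" idea mentioned above could be sketched roughly like this. It is only an illustration, not a tested recipe for numpy 1.8.2: the names dates, counts and values are made up for the example, and it assumes the three columns should end up as datetime64[s], integer and float arrays respectively.

```python
import numpy as np

data = [
    ['2015-01-03 05:00:00', 5, 5.01],
    ['2015-01-04 05:00:00', 7, 7.01],
    ['2015-01-05 05:00:00', 8, 8.01],
    ['2015-01-06 05:00:00', 10, 10.01],
]

# Transpose the rows into three column tuples (hypothetical names).
date_strings, raw_counts, raw_values = zip(*data)

# Convert each column with its own dtype instead of forcing one dtype
# onto the mixed rows.
dates = np.array(date_strings, dtype='datetime64[s]')
counts = np.array(raw_counts, dtype=int)
values = np.array(raw_values, dtype='<f8')
```

Each array then holds one kind of data, and the same index i refers to the same record in all three.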

I should probably also mention that np.dtype, as I have seen it used, is mostly intended to describe the layout of an array. I believe the idea is that you first declare an np.dtype and then create an np.array with that dtype. This gives you a way to describe arrays such as yours, not to implicitly convert them the way you wanted. It lets arrays behave a bit like dicts, so that coders can write cleaner, more explicit code without a lot of indices whose meaning others don't know. Look at the tutorial example:

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)
x[1]
#output
('John', [6.0, 7.0])

x[1]['grades']
#output
array([ 6.,  7.])
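Applied to the data from the question, the same pattern might look like the sketch below. The field names 'time', 'count' and 'value' are invented for the example, and I have not run this on numpy 1.8.2, so treat it as an assumption rather than a confirmed fix. Note that np.dtype takes a single type description as its first argument (the following positional arguments are the align and copy flags), so a multi-field layout has to be written as a list of (name, type) tuples, and each record has to be passed as a tuple rather than a list.

```python
import numpy as np

data = [
    ['2015-01-03 05:00:00', 5, 5.01],
    ['2015-01-04 05:00:00', 7, 7.01],
    ['2015-01-05 05:00:00', 8, 8.01],
    ['2015-01-06 05:00:00', 10, 10.01],
]

# Structured dtype with one named field per column (hypothetical names).
dt = np.dtype([('time', 'M8[s]'), ('count', '<f8'), ('value', '<f8')])

# Structured arrays are built from tuples, one tuple per record,
# so each inner list is converted to a tuple first.
arr = np.array([tuple(row) for row in data], dtype=dt)

# Fields are accessed by name, like in the tutorial example.
times = arr['time']
second_count = arr['count'][1]
```

The result is a one-dimensional array of four records rather than a 4x3 array, and arr['time'] is an ordinary datetime64[s] array.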
ljetibo