
I'm trying to read an XML file into a NumPy record array. Times are in Zulu (UTC) time, e.g. u'2013-06-06T17:47:38Z', and the other columns are floats.

The times and the floats can each be converted into NumPy arrays. However, if I try to make a record array, it fails in a variety of ways (which probably indicates that I don't know how to create a record array):

In [124]: dataarr = np.array(zip(*[datadict[k] for k in keys]),
   .....:                     dtype=[(k,dtypes[k]) for k in keys])
Traceback (most recent call last):
  File "<ipython-input-124-d59123796cfa>", line 2, in <module>
    dtype=[(k,dtypes[k]) for k in keys])
ValueError: Cannot create a NumPy datetime other than NaT with generic units

In [125]: dataarr = np.array([datadict[k] for k in keys],
                    dtype=[(k,dtypes[k]) for k in keys])
Traceback (most recent call last):
  File "<ipython-input-125-ee9077bf1961>", line 2, in <module>
    dtype=[(k,dtypes[k]) for k in keys])
TypeError: expected a readable buffer object

In [126]: dataarr = np.array([datadict[k] for k in keys],
                    dtype=[dtypes[k] for k in keys])
Traceback (most recent call last):
  File "<ipython-input-126-a456052bdfd4>", line 2, in <module>
    dtype=[dtypes[k] for k in keys])
TypeError: data type not understood

In [127]: dtypes
Out[127]: {'altitude': 'float', 'distance': 'float', 'time': 'datetime64'}

What is the proper approach for creating a record array that includes times?

(keys is a list; datadict and dtypes are dicts)

keflavich
  • possible duplicate of [numpy datetime64 in recarray](http://stackoverflow.com/questions/16618499/numpy-datetime64-in-recarray) – askewchan Jun 10 '13 at 16:15
  • I also recall a comment going by on the matplotlib list that numpy + datetime types was still in flux/buggy. – tacaswell Aug 31 '13 at 21:11
  • @askewchan: I don't think this is a duplicate, since my solution was to *not* use `datetime[D]`. – keflavich Sep 01 '13 at 00:46
  • @tcaswell - it might help to link that comment. However, since I was able to solve this specific problem with the answer below, it probably isn't too buggy for some purposes. – keflavich Sep 01 '13 at 00:47
  • https://github.com/matplotlib/matplotlib/pull/2199 <- discussion at end – tacaswell Sep 01 '13 at 02:08

1 Answer


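Whoops, figured it out using [numpy datetime64 in recarray](http://stackoverflow.com/questions/16618499/numpy-datetime64-in-recarray): the datetime64 dtype needs an explicit unit.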

I tried using datetime64[D], which failed:

In [19]: dtypes
Out[19]: {'altitude': 'float', 'distance': 'float', 'time': 'datetime64[D]'}

In [20]: dataarr = np.array(zip(*[datadict[k] for k in keys]),
                    dtype=[(k,dtypes[k]) for k in keys])
Traceback (most recent call last):
  File "<ipython-input-20-d59123796cfa>", line 2, in <module>
    dtype=[(k,dtypes[k]) for k in keys])
TypeError: Cannot cast NumPy timedelta64 scalar from metadata [s] to [D] according to the rule 'same_kind'

but datetime64[s] works:

In [22]: dtypes
Out[22]: {'altitude': 'float', 'distance': 'float', 'time': 'datetime64[s]'}

In [23]: dataarr = np.array(zip(*[datadict[k] for k in keys]),
                    dtype=[(k,dtypes[k]) for k in keys])
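
For anyone who wants to try this without the original XML file, here is a minimal self-contained sketch of the working case. The sample values in datadict are made up for illustration, and two details are my own assumptions rather than part of the session above: the zip is wrapped in list() so it also runs on Python 3, and the trailing 'Z' is stripped because newer NumPy versions warn when parsing timezone-aware strings.

```python
import numpy as np

keys = ['time', 'altitude', 'distance']
dtypes = {'time': 'datetime64[s]', 'altitude': 'float', 'distance': 'float'}

# Hypothetical sample data standing in for the values parsed from the XML file.
datadict = {
    'time': ['2013-06-06T17:47:38Z', '2013-06-06T17:47:39Z'],
    'altitude': [1609.0, 1610.5],
    'distance': [0.0, 3.2],
}

# Assumption: drop the trailing 'Z' so NumPy parses the strings as naive
# (implicitly UTC) timestamps without a timezone deprecation warning.
columns = dict(datadict)
columns['time'] = [t.rstrip('Z') for t in datadict['time']]

# One tuple per row; list() is needed on Python 3, where zip is lazy.
rows = list(zip(*[columns[k] for k in keys]))

# The explicit unit in 'datetime64[s]' is what makes this succeed.
dataarr = np.array(rows, dtype=[(k, dtypes[k]) for k in keys])

print(dataarr['time'])
print(dataarr.dtype)
```

Each column is then available by name, e.g. dataarr['altitude'], and dataarr.view(np.recarray) gives attribute-style access if that is preferred.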
keflavich