In [25]: dates = np.arange(np.datetime64('2010-01-01'),np.datetime64('2014-12-31'))
In [26]: dates
Out[26]:
array(['2010-01-01', '2010-01-02', '2010-01-03', ..., '2014-12-28',
'2014-12-29', '2014-12-30'], dtype='datetime64[D]')
In [27]: dates.shape
Out[27]: (1825,)
Like a regular np.arange
(e.g. np.arange(1825)
), this creates a 1d array spanning a range of values. To be clear the array contains numeric values (floats) that represent the date value. The ndarray
code uses the specificed dtype
to interpret those numbers. When displaying the array it renders the data as these data strings (as dictated by the [D]
part of the dtype
).
What sort of columns do you need?
As with any 1d array, you can create a volume vector with:
In [28]: dates[:,None]
Out[28]:
array([['2010-01-01'],
['2010-01-02'],
['2010-01-03'],
...,
['2014-12-28'],
['2014-12-29'],
['2014-12-30']], dtype='datetime64[D]')
alist = dates.tolist()
In [59]: alist[:10]
Out[59]:
[datetime.date(2010, 1, 1),
datetime.date(2010, 1, 2),
datetime.date(2010, 1, 3),
datetime.date(2010, 1, 4),
datetime.date(2010, 1, 5),
datetime.date(2010, 1, 6),
datetime.date(2010, 1, 7),
datetime.date(2010, 1, 8),
datetime.date(2010, 1, 9),
datetime.date(2010, 1, 10)]
produces a list of datetime
objects. We extract year, month, day from those objects iteratively.
In [66]: np.array([[x.year, x.month, x.day] for x in alist])
Out[66]:
array([[2010, 1, 1],
[2010, 1, 2],
[2010, 1, 3],
...,
[2014, 12, 28],
[2014, 12, 29],
[2014, 12, 30]])
np.datetime64
dtype doesn't have the exact equivalent. We can convert them into various units, and take differences.
In [68]: yrs = dates.astype('datetime64[Y]')
In [69]: yrs
Out[69]: array(['2010', '2010', '2010', ..., '2014', '2014', '2014'], dtype='datetime64[Y]')
In [70]: mths = dates.astype('datetime64[M]')-yrs
In [71]: mths
Out[71]: array([ 0, 0, 0, ..., 11, 11, 11], dtype='timedelta64[M]')
In [72]: days = dates - dates.astype('datetime64[M]')
In [73]: days
Out[73]: array([ 0, 1, 2, ..., 27, 28, 29], dtype='timedelta64[D]')
With different dtypes
we can't directly concatenate these. But convert them to ints we can get the same 2d array:
In [76]: np.stack((yrs.astype(int), mths.astype(int), days.astype(int)),axis=1)
Out[76]:
array([[40, 0, 0],
[40, 0, 1],
[40, 0, 2],
...,
[44, 11, 27],
[44, 11, 28],
[44, 11, 29]])
(year needs an offset).
In [77]: np.stack((yrs.astype(int)+1970, mths.astype(int), days.astype(int)),axis=1)
Or a structured array 'stacking':
In [78]: np.rec.fromarrays([yrs, mths, days])
Out[78]:
rec.array([('2010', 0, 0), ('2010', 0, 1), ('2010', 0, 2), ...,
('2014', 11, 27), ('2014', 11, 28), ('2014', 11, 29)],
dtype=[('f0', '<M8[Y]'), ('f1', '<m8[M]'), ('f2', '<m8[D]')])