
I have a numpy array:

>>> type(dat)
Out[41]: numpy.ndarray

>>> dat.shape
Out[46]: (127L,)

>>> dat[0:3]
Out[42]: array([datetime.date(2010, 6, 11), datetime.date(2010, 6, 19), datetime.date(2010, 6, 30)], dtype=object)

I want to get weekdays for each date in this array like the following:

>>> dat[0].weekday()
Out[43]: 4

I tried the following, but none of these attempts work:

import pandas as pd
import numpy as np
import datetime as dt

np.apply_along_axis(weekday,0,dat)
NameError: name 'weekday' is not defined

np.apply_along_axis(dt.weekday,0,dat)
AttributeError: 'module' object has no attribute 'weekday'

np.apply_along_axis(pd.weekday,1,dat)
AttributeError: 'module' object has no attribute 'weekday'

np.apply_along_axis(lambda x: x.weekday(),0,dat)
AttributeError: 'numpy.ndarray' object has no attribute 'weekday'

np.apply_along_axis(lambda x: x.dt.weekday,0,dat)
AttributeError: 'numpy.ndarray' object has no attribute 'dt'

Is there something I am missing here?

dayum
  • `weekday` is a method of the `datetime.date` object. `apply_along_axis` expects a function. Just do your calculation with a list comprehension. Even if you could get it working, `apply_along_axis` won't speed things up. – hpaulj Jan 30 '18 at 00:33

2 Answers


np.apply_along_axis doesn't make much sense with a 1d array. With a 2d or higher-dimensional array, it applies the function to 1d slices of that array. The documentation says of the function argument:

This function should accept 1-D arrays. It is applied to 1-D slices of arr along the specified axis.
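To see what that means for a 1d input, here is a minimal sketch that records what apply_along_axis hands to the function. The helper name `record` and the `received` list are made up for illustration; the three dates are the ones from the question.

```python
import datetime
import numpy as np

received = []  # hypothetical log of what the function was called with

def record(slice_1d):
    # Note the type and shape of the argument, then return a scalar.
    received.append((type(slice_1d).__name__, slice_1d.shape))
    return len(slice_1d)

dates = np.array([datetime.date(2010, 6, 11),
                  datetime.date(2010, 6, 19),
                  datetime.date(2010, 6, 30)], dtype=object)

# With a 1d array there is only one 1d slice: the whole array.
result = np.apply_along_axis(record, 0, dates)
print(received)
```

The function is called once with the entire array rather than once per date, which is why a method of the individual elements cannot be called on its argument.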

This NameError is raised before apply_along_axis even runs, because no weekday function has been defined:

np.apply_along_axis(weekday,0,dat)
NameError: name 'weekday' is not defined

weekday is a method of a date, not a function in the dt module:

np.apply_along_axis(dt.weekday,0,dat)
AttributeError: 'module' object has no attribute 'weekday'

It's not defined in pandas either:

np.apply_along_axis(pd.weekday,1,dat)
AttributeError: 'module' object has no attribute 'weekday'

This looks better, but apply_along_axis passes an array (1d) to the lambda. weekday isn't an array method.

np.apply_along_axis(lambda x: x.weekday(),0,dat)
AttributeError: 'numpy.ndarray' object has no attribute 'weekday'

And an array doesn't have a dt attribute either.

np.apply_along_axis(lambda x: x.dt.weekday,0,dat)
AttributeError: 'numpy.ndarray' object has no attribute 'dt'

So let's forget about apply_along_axis.


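Define a sample, first as a list and then as an object array (this assumes a plain `import datetime`; after `from datetime import datetime`, the name `datetime.date` refers to a method and the calls below raise a TypeError):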

In [231]: alist = [datetime.date(2010, 6, 11), datetime.date(2010, 6, 19), datetime.date(2010, 6, 30)]
In [232]: data = np.array(alist)
In [233]: data
Out[233]: 
array([datetime.date(2010, 6, 11), datetime.date(2010, 6, 19),
       datetime.date(2010, 6, 30)], dtype=object)

And for convenience a lambda version of weekday:

In [234]: L = lambda x: x.weekday()

This can be applied iteratively in several ways:

In [235]: [L(x) for x in alist]
Out[235]: [4, 5, 2]
In [236]: [L(x) for x in data]
Out[236]: [4, 5, 2]
In [237]: np.vectorize(L)(data)
Out[237]: array([4, 5, 2])
In [238]: np.frompyfunc(L,1,1)(data)
Out[238]: array([4, 5, 2], dtype=object)

I just did time tests on a 3000 item list. The list comprehension was fastest (as I expected from past tests), but the time differences were not large. The biggest time consumer was simply running x.weekday() 3000 times.
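A rough way to repeat such a comparison is sketched below. The 3000 consecutive dates, the helper name `weekday`, and the repeat counts are assumptions chosen for illustration, not the exact setup used for the timings described above, and the numbers will vary by machine.

```python
import datetime
import timeit
import numpy as np

# Assumed sample: 3000 consecutive dates, as a list and as an object array.
start = datetime.date(2010, 1, 1)
alist = [start + datetime.timedelta(days=i) for i in range(3000)]
data = np.array(alist, dtype=object)

def weekday(d):
    # Plain function wrapper around the date method.
    return d.weekday()

candidates = {
    "list comprehension": lambda: [weekday(x) for x in alist],
    "np.vectorize": lambda: np.vectorize(weekday)(data),
    "np.frompyfunc": lambda: np.frompyfunc(weekday, 1, 1)(data),
}

# Best of 3 runs, each calling the candidate 20 times.
timings = {name: min(timeit.repeat(fn, number=20, repeat=3))
           for name, fn in candidates.items()}

for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s")
```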

hpaulj
  • alist throws TypeError: descriptor 'date' requires a 'datetime.datetime' object but received a 'int' – Sade May 15 '20 at 11:45

You could try vectorizing the weekday function so you can apply it elementwise to arrays.

weekday_func = lambda x: x.weekday()
get_weekday = np.vectorize(weekday_func)
get_weekday(dat)
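For a version that runs on its own, here is a sketch with the three sample dates from the question filled in for `dat`. Passing `otypes=[int]` is an optional addition, not part of the snippet above: it fixes the output dtype up front, so np.vectorize does not need a first element to infer it from and an empty array can be handled as well.

```python
import datetime
import numpy as np

# Sample data standing in for the question's 127-element array.
dat = np.array([datetime.date(2010, 6, 11),
                datetime.date(2010, 6, 19),
                datetime.date(2010, 6, 30)], dtype=object)

# otypes=[int] is an assumption added here; it declares the output dtype.
get_weekday = np.vectorize(lambda d: d.weekday(), otypes=[int])

weekdays = get_weekday(dat)
print(weekdays)  # [4 5 2]

# With otypes set, an empty object array gives an empty result.
empty = get_weekday(np.array([], dtype=object))
print(empty.shape)  # (0,)
```

np.vectorize is still a Python-level loop over the elements, so this is a convenience rather than a speed-up.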
Ethan Henderson